The current online help is aimed at providing essential information on using the OECD QSAR Toolbox for Grouping Chemicals into Categories. The main objective of the Toolbox is to allow the user to use (Q)SAR methodologies to group chemicals into categories and to fill data gaps by read-across, trend analysis and (Q)SARs. For in-depth background information on the concept of chemical categories, the user is invited to consult the guidance document for grouping of chemicals published in the Series on Testing and Assessment of the OECD Environment, Health and Safety Publications [OECD (2007); ENV/JM/MONO(2007)28: http://www.oecd.org/officialdocuments/displaydocument/?doclanguage=en&cote=env/jm/mono(2007)28].
Additional guidance and training material are available on the dedicated internet site for the QSAR Toolbox [http://www.qsartoolbox.org], the internet site for the OECD (Q)SAR Project [http://www.oecd.org/env/existingchemicals/qsar] as well as the internet site of the developer of the QSAR Toolbox [http://toolbox.oasis-lmc.org/]. The user is invited to regularly consult these internet sites.
The QSAR Toolbox is a project of the Organisation for Economic Co-operation and Development in collaboration with the European Chemicals Agency. It has been developed by the Laboratory of Mathematical Chemistry.
The development of the QSAR Toolbox is a large collaborative effort, and many scientific teams and stakeholders are donating their skills and tools to be integrated into the Toolbox [see http://www.oecd.org/env/chemicalsafetyandbiosafety/assessmentofchemicals/donorstotheqsartoolbox.htm].
3. What is the QSAR Toolbox
The OECD QSAR Toolbox is a software application designed to reduce the use of animals in laboratory tests, reduce the cost of testing and increase the number of chemicals that are assessed for their effects on human health and the environment. The OECD QSAR Toolbox provides scientific computational methods and information technologies for applying the category approach to fill gaps in the experimental data that are necessary for hazard and risk assessment. By making use of the system, hazard and risk assessors are able to:
o Use predefined categories, refine existing categories or build new ones.
o Identify analogous chemicals (or a category) based on user-selected characteristics. Categorize chemicals accounting for their metabolism: rate of disappearance, formation of stable metabolites, formation of highly reactive intermediates, deactivation pathways, etc.
o Extract all available experimental or pre-calculated data from local and remote (web-based) databases, accompanied by information about their reliability: experimental error, analytical or computational method used, replicates, etc.
o Fill the gaps of missing information within the category by making use of chemometric approaches such as read-across, trend analysis and (Q)SAR models.
QSAR predictions are accompanied by information concerning their mechanistic background, training chemicals, statistics, applicability domain and validity.
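The two simplest gap-filling approaches named above can be illustrated with a minimal Python sketch. It assumes that each category member is described by a single numeric descriptor and one endpoint value, that read-across averages the values of the nearest analogues, and that trend analysis fits a straight line. The function names, the parameter k and the data are hypothetical; the algorithms and options actually used by the Toolbox are not described in this help text and may differ.

```python
from statistics import mean

def read_across(target_x, analogues, k=3):
    """Average the endpoint values of the k analogues closest to the target.

    analogues is a list of (descriptor, endpoint) pairs; closeness is the
    absolute difference in the descriptor (an assumption of this sketch).
    """
    nearest = sorted(analogues, key=lambda a: abs(a[0] - target_x))[:k]
    return mean(y for _, y in nearest)

def trend_analysis(target_x, analogues):
    """Fit a least-squares line through the analogues and evaluate it at the target."""
    xs = [x for x, _ in analogues]
    ys = [y for _, y in analogues]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all analogues share the same descriptor value")
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    return intercept + slope * target_x

# Hypothetical category: (descriptor, endpoint) pairs for four analogues.
category = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
interpolated = read_across(2.5, category, k=2)   # target lies inside the category
extrapolated = trend_analysis(5.0, category)     # target lies beyond the analogues
```

Read-across only interpolates among existing analogues, whereas the fitted trend can also be evaluated outside the range of the category members, which is one reason a prediction should be judged against its applicability domain.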
The OECD QSAR Toolbox is an expandable application that navigates the information flows between all of the installed components (modules): computational tools, database managers, (Q)SAR libraries, categorization models, etc.
II. User interface
The interface of the Toolbox is designed to follow the typical workflow for predicting endpoint(s) for a given chemical (named a target chemical). The six main stages of the workflow (Input, Profiling, Endpoint, Category definition, Data gap filling and Reporting) are represented on a toolbar (1) located at the top of the application window (Figure 1). Below the stages toolbar is another toolbar, the actions toolbar (2), which provides the most important actions related to the current stage. On the left side of the main window is the stage options panel (3), which provides content specific to the current stage and actions related to that content. The largest part of the main window is occupied by the data matrix (4). It is available in all stages except Reporting and shows the queried data, both experimental and predicted, for the chemicals loaded into the system.
1. Stages toolbar
The stages toolbar is a permanent part of the Toolbox interface. It allows easy navigation between the main stages of the program's workflow. Each stage is represented by a toolbar button, which opens the interface related to that stage. Some examples are provided below. (Figure 2)
The basic actions of the stage "Endpoint" (Figure 6)
3. Stage options panel
The stage options panel provides content specific to the current stage and actions related to that content. Because each stage has its own functions, the content of the stage options panel differs from stage to stage. Some examples are provided below:
Input: In the Input stage, the stage options panel lists the work documents and their content. It also provides two approaches for multiplication of the target structures: multiplication by tautomerism and multiplication by metabolism.
Metabolism can be applied via 3 observed and 9 simulated metabolism simulators:
• Observed Liver metabolism
• Observed Mammalian metabolism
• Observed Microbial metabolism
• Autoxidation simulator
• Dissociation simulator
• Hydrolysis (Acidic)
• Hydrolysis (Basic)
• Hydrolysis (Neutral)
• Liver metabolism
• Microbial metabolism simulator
• Skin metabolism simulator
Multiplication by Metabolism – select Skin metabolism simulator
To multiply the loaded target structure, the user should right-click the SMILES of the target in the stage options panel (1), select Multiplication from the pop-up menu (2), then press Metabolism (3) and finally select a simulator, for example Skin metabolism simulator (4). (Figure 7)
Double-click the tautomeric/metabolic set (1) to open a window displaying the pictures of the set’s constituents. (Figure 9)
4. Data matrix
Below is a snapshot displaying the data matrix window (1). (Figure 10)
Experimental data available in the Toolbox databases are assigned to the other four general nodes and their sub-nodes. Each of these four nodes corresponds to a basic section defined by the type of experimental data assigned to it.
For example, results for melting point or partition coefficients (Figure 13) are assigned to the nodes Melting/Freezing Point or Partition Coefficient, which are sub-nodes of the node Physical Chemical Properties (1), while data associated with the Ames test or chromosomal aberration are assigned to the nodes Bacterial Reverse Mutation Assay (e.g. Ames Test) and In Vitro Mammalian Chromosome Aberration Test, which are sub-nodes of the node Human Health Hazards (2). (Figure 14)
Construction of Endpoint tree
The Endpoint tree is constructed in two parts: a predefined part and a dynamic part. The predefined part is rigid and cannot be reordered, while the dynamic part is flexible and can be reordered. This functionality is implemented because of the diversity of experimental data available from different databases. To check which part is predefined and which is dynamic, press the Ctrl key: the predefined part of the tree is then underlined.
• Predefined part (Figure 15)
• Dynamic part (Figure 16)
The metadata fields associated with the experimental data are used to build the dynamic part of the endpoint tree. In this case, in vitro and in vivo are values of the metadata field called Type of method (1). (Figure 17)
The next node, Bacterial reverse mutation assay (e.g. Ames test), is the Test type (2). (Figure 18)
The subsequent two nodes Gene mutation and Salmonella typhimurium are associated with the field Type of genotoxicity and the field Test organism (species) (3). (Figure 19)
The last two nodes are associated with the following two fields: Metabolic activation and Strain (4). (Figure 20)
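The way metadata fields become nested nodes can be sketched in Python as a grouping of records by an ordered list of field names. This is only an illustration of the idea under stated assumptions: the record layout, the function name build_endpoint_tree, the reserved key "_values", the placeholder label "Undefined" and the sample values for Metabolic activation and Strain are all hypothetical and are not taken from the Toolbox.

```python
def build_endpoint_tree(records, hierarchy):
    """Nest records under one tree level per metadata field, in the given order."""
    tree = {}
    for record in records:
        node = tree
        for field in hierarchy:
            # Records lacking a field are grouped under a placeholder label.
            label = record["metadata"].get(field, "Undefined")
            node = node.setdefault(label, {})
        # Leaf level: collect the experimental values under a reserved key.
        node.setdefault("_values", []).append(record["value"])
    return tree

# Field names follow the text above; the field values are illustrative.
hierarchy = [
    "Type of method",
    "Test type",
    "Type of genotoxicity",
    "Test organism (species)",
    "Metabolic activation",
    "Strain",
]
common = {
    "Type of method": "in vitro",
    "Test type": "Bacterial reverse mutation assay (e.g. Ames test)",
    "Type of genotoxicity": "Gene mutation",
    "Test organism (species)": "Salmonella typhimurium",
    "Strain": "TA 100",  # hypothetical value
}
records = [
    {"value": "Negative", "metadata": {**common, "Metabolic activation": "With S9"}},
    {"value": "Positive", "metadata": {**common, "Metabolic activation": "Without S9"}},
]

tree = build_endpoint_tree(records, hierarchy)
organism_node = (
    tree["in vitro"]["Bacterial reverse mutation assay (e.g. Ames test)"]
    ["Gene mutation"]["Salmonella typhimurium"]
)
```

In this sketch the two records share every node down to the test organism and split only at the Metabolic activation level, which mirrors how one branch of the dynamic part fans out into sub-nodes.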
Each of these fields can be reordered using the Set tree hierarchy functionality. To use it, right-click the node whose hierarchy should be reordered (1) and then click Set tree hierarchy (2) in the context menu. A small blue triangle appears at the level of the node for which a hierarchy is set (3). (Figure 21)
The Set tree hierarchy window appears.
Set tree hierarchy functionality
The Set tree hierarchy window contains two panels: Metadata labels (1) and Sub-nodes (2). (Figure 22)
The Toolbox comes with a default hierarchy. The Metadata labels panel contains a list of the most commonly used fields. If the user wants to set another field as a sub-node, he/she should check the Show all labels box (1); the list of all labels available in the different databases then appears (2). (Figure 23)
The sequence of fields (1) displayed in the Sub-nodes panel specifies the organization of the nodes of the endpoint tree (2). (Figure 24)
The user can add fields or remove fields already specified as sub-nodes using the auxiliary buttons (1). (Figure 25)
The user can reorder the sub-nodes using the Up and Down buttons (2). (Figure 26)
If the user wants to restore the default settings of the endpoint tree, he/she can click the Default button (3). (Figure 27)
The changes in the endpoint hierarchy are confirmed by pressing the OK button.
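The operations offered by this window amount to editing an ordered list of field names. The Python sketch below models them under that assumption; the class TreeHierarchy, its method names and the default field list are hypothetical and only mirror the buttons described above.

```python
class TreeHierarchy:
    """Ordered list of metadata fields used as sub-nodes (illustrative model)."""

    def __init__(self, default_fields):
        self._default = list(default_fields)
        self.fields = list(default_fields)

    def add(self, field):
        # Adding a field appends it as the deepest sub-node level.
        if field not in self.fields:
            self.fields.append(field)

    def remove(self, field):
        if field in self.fields:
            self.fields.remove(field)

    def move_up(self, field):
        i = self.fields.index(field)
        if i > 0:
            self.fields[i - 1], self.fields[i] = self.fields[i], self.fields[i - 1]

    def move_down(self, field):
        i = self.fields.index(field)
        if i < len(self.fields) - 1:
            self.fields[i + 1], self.fields[i] = self.fields[i], self.fields[i + 1]

    def reset(self):
        # Counterpart of the Default button: restore the original order.
        self.fields = list(self._default)

# Assumed default order, for illustration only.
hierarchy = TreeHierarchy(["Type of method", "Test type", "Type of genotoxicity"])
hierarchy.add("Strain")
hierarchy.move_up("Strain")
edited = list(hierarchy.fields)
hierarchy.reset()
restored = list(hierarchy.fields)
```

Because the order of the list defines the nesting of the tree, moving a field up makes it a higher-level node for every branch below the node on which the hierarchy was set.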
Filtering nodes of the Endpoint tree
The nodes of the endpoint tree can be filtered using the Filter endpoint tree… functionality. To filter the endpoint tree, the user should type the desired query in the blank field named Filter endpoint tree… (1). The field then changes from white to green (2), indicating that the endpoint tree is filtered (3). For instance, typing “skin” filters the endpoint tree so that only nodes related to skin are visible. (Figures 28-29)
When the user deletes the query, the system restores the default settings of the endpoint tree.
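The filtering behaviour can be approximated in Python as a text match over the paths of the tree. The sketch assumes a case-insensitive substring match on node labels and treats an empty query as "no filter"; the help text does not state how the Toolbox matches queries, and the function name and the sample paths (including the skin-related node names) are hypothetical.

```python
def filter_paths(paths, query):
    """Keep the endpoint paths in which at least one node label contains the query."""
    query = query.strip().lower()
    if not query:
        # An empty query restores the unfiltered tree.
        return list(paths)
    return [
        path for path in paths
        if any(query in label.lower() for label in path)
    ]

# Illustrative endpoint paths; node names are examples only.
paths = [
    ("Human Health Hazards", "Sensitisation", "Skin"),
    ("Human Health Hazards", "Irritation / Corrosion", "Skin"),
    ("Physical Chemical Properties", "Melting/Freezing Point"),
]

skin_only = filter_paths(paths, "skin")
unfiltered = filter_paths(paths, "")
```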
Sorting and filtering data assigned to a defined node
A functionality is available for sorting the experimental data in a given row of the data matrix. The user should right-click (1) the node whose data are to be sorted and then select one of the following options:
• Sort (targets priority) – with this option the chemicals are sorted by experimental data in descending or ascending order, taking into account the priority of the target chemical (2). This means that the target stays in the first column and the other chemicals are placed after the target in descending or ascending order. (Figure 30)
• Sort – with this option chemicals are sorted in descending or ascending order without taking into account the priority of the target chemical (3). (Figure 31)
• Function – this functionality displays the minimal, maximal or average value if more than one experimental value is available for a chemical. This functionality works for data on a given row (4). (Figure 32)
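The difference between the two sort options and the effect of Function can be shown with a short Python sketch. It assumes each column of the data matrix is a record with a name, a target flag and one numeric value for the selected row; the function names, the dictionary layout and the data are hypothetical.

```python
from statistics import mean

def sort_columns(columns, descending=False, target_priority=True):
    """Order chemicals by value; optionally keep the target in the first column."""
    if target_priority:
        targets = [c for c in columns if c["is_target"]]
        rest = [c for c in columns if not c["is_target"]]
    else:
        targets, rest = [], list(columns)
    rest = sorted(rest, key=lambda c: c["value"], reverse=descending)
    return targets + rest

def aggregate(values, function):
    """Reduce several experimental values for one chemical to a single number."""
    functions = {"min": min, "max": max, "average": mean}
    return functions[function](values)

# Illustrative row of the data matrix.
columns = [
    {"name": "target", "is_target": True, "value": 5.0},
    {"name": "analogue A", "is_target": False, "value": 9.0},
    {"name": "analogue B", "is_target": False, "value": 1.0},
]

with_priority = [c["name"] for c in sort_columns(columns)]
without_priority = [c["name"] for c in sort_columns(columns, target_priority=False)]
average_value = aggregate([2.0, 4.0, 6.0], "average")
```

With target priority the target column is excluded from the ordering and pinned first; without it the target is sorted like any other chemical.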
Tips related to the Endpoint tree area
Some additional features are available by right-clicking the area of the endpoint tree:
• Hidden nodes
Hidden nodes of the endpoint tree can be displayed. The 2D and 3D parameters are hidden nodes, listed in two separate nodes. To display the list of parameters, the user should right-click the endpoint tree and select Show hidden (1). (Figure 33)
A list of nodes with 2D and 3D parameters then appears (1). Hidden nodes are shown in blue font. (Figure 34)
The parameters are listed under the sub-nodes 2D and 3D. To calculate a parameter, the user right-clicks the desired parameter (1) and selects one of the available options: Calculate/Extract for all chemicals or Calculate all parameters (2). These two options calculate the selected parameter for all chemicals loaded in the data matrix. (Figure 35)
To calculate a parameter for one specific chemical, the user hovers over the cell of the chemical corresponding to the desired parameter (1) and, from the pop-up menu (right mouse click) (2), selects one of the options (3): Calculate ….. or Calculate/Extract all 2D parameters. (Figure 36)
• Supporting functionalities
o Collapse all – collapses all expanded nodes of the endpoint tree (1). (Figure 37)
o Export – exports data from a row of the data matrix (1). (Figure 38)
o Export CAS list – exports a list of the CAS numbers of the chemicals loaded in the data matrix (1). (Figure 39)
o Wiki search species – searches for test organisms in Wikipedia (1). (Figure 40)
o Copy path – copies the endpoint path (1). (Figure 41)
4.2. Area with selected chemicals (2)
The chemicals which are loaded in the system appear in the data matrix ordered in separate columns. There are identification labels for tautomers and mixtures. The identification label for tautomers is “T” (1) (Figure 42), while the mixtures are labeled with “Mix” (2). (Figure 43)
Tautomers label “T”
Mixture label “Mix”
A tautomeric set is indicated by the tautomeric label “T” (1) and the number of tautomers which belong to the tautomeric set (2). (Figure 44)
All tautomers from a given tautomeric set can be displayed by double-clicking the field with the molecular structure (1). A window with all available tautomers opens (2). (Figure 45)
A filter option is available here that allows specific chemical(s) with experimental data to be ignored. To use it, the user should right-click (1) somewhere in the red enclosed area of the chemical that should be ignored. (Figure 49)