A. Basic information
This module provides several ways of grouping chemicals into a toxicologically meaningful category that includes the target molecule. The grouping methods let the user form chemical categories according to different measures of “similarity”, so that data gaps within a category can be filled by read-across or trend analysis. This is the critical step in the workflow, and the Toolbox offers several options to help the user refine the category definition via subcategorization.
For example, within a large inventory the chemicals can be grouped according to their aquatic toxicity mode of action. Alternatively, starting from a target chemical for which a specific mechanism of action is identified, analogues can be found that can bind by the same mechanism and for which experimental results are available. If no specific mechanisms or modes of action relevant for the investigated endpoint are identified for the target chemical, it is recommended to search for chemicals that are structurally similar to the target chemical. The search results can then be refined by eliminating those chemicals that have specific mechanisms or modes of action.
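The first example above, grouping an inventory by mode of action, can be sketched in a few lines of Python. This is only an illustration of the idea, not the Toolbox's own implementation: the chemical identifiers, the mode-of-action labels and the function name group_by_mode_of_action are all hypothetical.

```python
from collections import defaultdict

# Hypothetical inventory: chemical identifier -> mode of action from a profiler.
# None means that no specific mode of action was identified.
inventory = {
    "chem_A": "MOA_1",
    "chem_B": "MOA_2",
    "chem_C": "MOA_1",
    "chem_D": None,
}


def group_by_mode_of_action(profiles):
    """Return a mapping: mode of action -> sorted list of chemicals."""
    groups = defaultdict(list)
    for chemical, mode in profiles.items():
        label = mode if mode is not None else "unclassified"
        groups[label].append(chemical)
    return {label: sorted(members) for label, members in groups.items()}


groups = group_by_mode_of_action(inventory)
for label, members in groups.items():
    print(label, members)
```

Chemicals that end up in the "unclassified" group are the ones for which a structural similarity search would be the recommended next step.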
B. Grouping methods
The list of grouping methods covers the list of profiling methods (Table 1):
Table 1. List with profiling and grouping methods
Summary background information for some grouping methods is listed in Table 2.
Table 2. Summary information for some grouping methods
When searching for analogues of a target chemical, the outcome of the profiling determines the most appropriate approach. The following recommendations can be made:
Figure 1. Category definition
When the grouping method is executed, it provides the user with all the categories in the selected grouping method, and the categories of the target chemical (1), if any, populate the Target(s) profiles (2) list box (Figure 2).
Figure 2. Target(s) profiles panel
At this stage the user can:
• remove target categories by selecting one of them (1) and moving it to the All profiles (2) panel using the down arrow (Figure 3)
Figure 7
The software identified 24 chemicals from the selected database(s) with the same profiles as those of the target chemical.
After the category name is defined, the software automatically starts a gather data action. The user can select a specific endpoint (Choose…) or, by default, retrieve data for all endpoints (All endpoints) (see below) (Figure 8). If the user has previously selected only databases related to the investigated endpoints, both options return the same results.
Figure 8
If the user has selected all databases under the Endpoint section and selects All endpoints, the gather data operation can be very time consuming due to the diversity of endpoints and the size of the databases. The user is therefore recommended to always select only those databases and endpoint paths that are related to the investigated endpoints.
After the user confirms which data are to be read from the databases, the Repeating values dialog appears (Figure 9). This window appears when the same measured data are found for chemicals in more than one place in the Toolbox databases. Data redundancies are identified, and the user can select which data to keep and which to filter out. Buttons to select a single data value (1) or all data values (2) are also available.
Figure 9
The only difference between the rows for a given chemical is in the information in some of the metadata fields. In the case study shown in Figure 10, the first chemical has the same data value, 4.1.105 micrograms per liter, in each row but different metadata information in the “Age” field (1).
Figure 10
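The idea behind the Repeating values dialog can be illustrated with a small Python sketch: records that share chemical, value and unit but differ only in metadata are treated as repeats. The record layout, the field names and both function names are assumptions made for this example, and the values are invented; the Toolbox's actual matching rules are not described here.

```python
from collections import defaultdict

# Hypothetical records; "age" stands in for a metadata field.
records = [
    {"chemical": "chem_A", "value": 12.0, "unit": "mg/L", "age": "adult"},
    {"chemical": "chem_A", "value": 12.0, "unit": "mg/L", "age": "juvenile"},
    {"chemical": "chem_A", "value": 30.0, "unit": "mg/L", "age": "adult"},
    {"chemical": "chem_B", "value": 5.0, "unit": "mg/L", "age": "adult"},
]


def data_key(row):
    """Key that ignores metadata: same chemical, same value, same unit."""
    return (row["chemical"], row["value"], row["unit"])


def find_repeating_values(rows):
    """Return only the groups of rows that repeat the same measured value."""
    by_key = defaultdict(list)
    for row in rows:
        by_key[data_key(row)].append(row)
    return {key: group for key, group in by_key.items() if len(group) > 1}


def keep_first(rows):
    """Filter out repeats, keeping the first row seen for each key."""
    seen = set()
    kept = []
    for row in rows:
        key = data_key(row)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


repeats = find_repeating_values(records)
filtered = keep_first(records)
```

Keeping the first row is just one possible policy; in the dialog the user decides which of the repeated values to keep.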
Finally, after the data are read, the defined category appears under the Defined Categories panel (Figure 11).
Figure 15
Categorization of set of parent and metabolites
When the user categorizes a set consisting of a parent chemical and its metabolites, all profiles of the parent and the metabolites are taken into account (1) (Figure 16)
Figure 16
In the Categorization panel, the profiles of all metabolites are taken into account along with those of the parent (1) (Figure 17)
Figure 17
4. Categorization using profiling results of a hierarchical profiling scheme:
Profiling results from hierarchical schemes such as Protein and DNA binding give information on the Domain, Mechanistic alert and Structural alert levels. Profiling results are visualized hierarchically (Figure 18):
Figure 18
The Toolbox allows a category to be defined using each of these profiling results. If the user applies the category “Domain” (Figure 19) for categorization purposes, the software searches for chemicals that meet the criteria of the category “Domain” (e.g. SN2). SN2 includes the following categories:
Figure 19
The defined category will therefore include chemicals classified in one of the Mechanistic alerts shown above, or in all of them together, depending on the logical operator used in defining the category.
The defined category “SN2” will be broader (will include more chemicals) than the category “Nucleophilic substitution at Nitrogen atom” (Mechanistic alert), which includes only three structural alerts (Figure 20):
Figure 20
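The relationship between the hierarchy levels can be sketched as nested sets: a Domain collects all structural alerts of all its Mechanistic alerts, so a category defined at Domain level can never be narrower than one defined at Mechanistic alert level. The sketch below is illustrative only. Apart from “SN2” and “Nucleophilic substitution at Nitrogen atom”, every alert name, every chemical and both function names are hypothetical placeholders.

```python
# Hypothetical hierarchy: domain -> mechanistic alert -> structural alerts.
hierarchy = {
    "SN2": {
        "Nucleophilic substitution at Nitrogen atom": [
            "structural alert 1",
            "structural alert 2",
            "structural alert 3",
        ],
        "another mechanistic alert": [
            "structural alert 4",
            "structural alert 5",
        ],
    },
}

# Hypothetical profiling results: chemical -> structural alert (or None).
chemical_alerts = {
    "chem_A": "structural alert 1",
    "chem_B": "structural alert 4",
    "chem_C": "structural alert 3",
    "chem_D": None,
}


def alerts_under(domain, mechanistic_alert=None):
    """Structural alerts covered by a domain or by one of its mechanistic alerts."""
    branches = hierarchy[domain]
    if mechanistic_alert is not None:
        return set(branches[mechanistic_alert])
    return {alert for alerts in branches.values() for alert in alerts}


def members(domain, mechanistic_alert=None):
    """Chemicals whose structural alert falls under the chosen level."""
    allowed = alerts_under(domain, mechanistic_alert)
    return {chem for chem, alert in chemical_alerts.items() if alert in allowed}


domain_category = members("SN2")
mechanistic_category = members("SN2", "Nucleophilic substitution at Nitrogen atom")
print(sorted(domain_category))
print(sorted(mechanistic_category))
```

With this made-up data the Mechanistic alert category is a proper subset of the Domain category, which mirrors the statement that “SN2” is the broader of the two.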
Procedure for categorization
If the user defines a category using a hierarchical grouping method, Domain, Mechanistic alert and Structural alert can be used separately or simultaneously in the categorization procedure (Figure 21).
The user can remove the category related to “Structural alert” by selecting it (1) and moving it down (Figure 21), or leave it as is (the default):
Figure 21
If the software does not identify any analogues that meet the criteria of the Domain, Mechanistic alert and Structural alert categories combined by AND, the following message appears (Figure 22):
Figure 22
The user can then expand the category by removing the most specific category (Structural alert) (Figure 23) and using the remaining Domain and Mechanistic alert categories.
Figure 23
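The fallback described above (search with all three levels combined by AND, then drop the Structural alert if nothing is found) can be sketched as follows. This is a minimal sketch under assumptions: the dictionary layout, the placeholder alert names and the functions search and search_with_fallback are hypothetical, and in the Toolbox the relaxation is a manual step taken by the user rather than an automatic one.

```python
# Hypothetical database: chemical -> profiling result at each hierarchy level.
database = {
    "chem_A": {"domain": "SN2", "mechanistic_alert": "mech_1", "structural_alert": "alert_1"},
    "chem_B": {"domain": "SN2", "mechanistic_alert": "mech_1", "structural_alert": "alert_2"},
    "chem_C": {"domain": "SN2", "mechanistic_alert": "mech_2", "structural_alert": "alert_3"},
}


def search(db, criteria):
    """Chemicals that satisfy every criterion (criteria combined by AND)."""
    return sorted(
        name
        for name, profile in db.items()
        if all(profile.get(level) == value for level, value in criteria.items())
    )


def search_with_fallback(db, criteria):
    """Try all criteria first; if nothing matches, drop the structural alert."""
    hits = search(db, criteria)
    if hits:
        return hits, dict(criteria)
    relaxed = {k: v for k, v in criteria.items() if k != "structural_alert"}
    return search(db, relaxed), relaxed


# The target's structural alert ("alert_9") occurs nowhere in the database.
target_criteria = {
    "domain": "SN2",
    "mechanistic_alert": "mech_1",
    "structural_alert": "alert_9",
}
hits, used_criteria = search_with_fallback(database, target_criteria)
print(hits, used_criteria)
```

Returning the criteria that were actually used keeps it visible that the resulting category is broader than the one originally requested.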
E. Subcategorization
The second step in refining the broader category and defining the category of structurally and mechanistically similar analogues is the subcategorization procedure. It lets the user verify the mechanistic robustness of the analogue approach. If the analogues were identified according to a specific mechanism or mode of action, the target chemical and the analogues already share the relevant mechanisms and modes of action. Nevertheless, the analogues may also have additional mechanisms and modes of action due to additional functional groups in their molecules. For this reason the subcategorization procedure is applied to refine the categories by eliminating dissimilar structures.
The broader category can be refined by applying subcategorization. For example, 216 esters are identified by the US-EPA category for the chemical with CAS (1) (Figure 24)
Figure 24
Figure 25
Chemicals/analogues whose mechanism of interaction differs from that of the target chemical are highlighted with a blue background. Those that are not highlighted have at least one category in common with the target.
If the user selects the all categories radio button (1) (Figure 26), all analogues included in the refined category must have all of the categories, combined by logical conjunction (AND) (2)
Figure 26
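The two subcategorization modes can be compared in a short Python sketch: in the default mode an analogue is kept if it shares at least one category with the target, while in the all categories mode it must carry every category of the target. The data and the function name dissimilar_analogues are hypothetical, and reading "all categories" as "the analogue's categories include all of the target's" is an assumption of this sketch.

```python
def dissimilar_analogues(target_categories, analogues, require_all=False):
    """Return the analogues that would be highlighted for removal.

    require_all=False: keep an analogue sharing at least one target category.
    require_all=True:  keep only analogues having every target category (AND).
    """
    target = set(target_categories)
    flagged = []
    for name, categories in analogues.items():
        shared = target & set(categories)
        similar = (shared == target) if require_all else bool(shared)
        if not similar:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical profiling results of the target and of three analogues.
target = {"cat_1", "cat_2"}
analogues = {
    "chem_A": {"cat_1", "cat_2"},
    "chem_B": {"cat_1"},
    "chem_C": {"cat_3"},
}

at_least_one = dissimilar_analogues(target, analogues)
all_categories = dissimilar_analogues(target, analogues, require_all=True)
print(at_least_one, all_categories)
```

The stricter mode flags more analogues, so the resulting subcategory is smaller.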
After the dissimilar analogues are removed by clicking the Remove button (1), the software defines a new category, which is a subcategory of the parent category (2) (Figure 27)
Figure 27
The subcategory appears as a sub-node of the first category (1) (Figure 28). When the user selects the subcategory, a new data matrix with the chemicals included in it appears (2) (Figure 28)
Figure 28
F. Combine categories
Combining categories defines a new category by logically combining the members of already existing categories. The user clicks the Combine button (1); the chemicals from two or more defined categories (2) can then be combined using the logical AND/OR (3) operator (Figure 29)
Figure 29
After the combination logic is selected, the software defines a new category that includes the chemicals meeting the criteria of the combination (Figure 30).
Figure 30
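In set terms, combining by AND corresponds to an intersection of the category members and combining by OR to a union. The following is a minimal sketch with hypothetical category contents and a hypothetical function name combine_categories; it is not the Toolbox's own code.

```python
def combine_categories(categories, logic):
    """Combine the members of two or more categories with AND or OR."""
    member_sets = [set(members) for members in categories]
    if not member_sets:
        return set()
    if logic == "AND":
        return set.intersection(*member_sets)
    if logic == "OR":
        return set.union(*member_sets)
    raise ValueError("logic must be 'AND' or 'OR'")


# Hypothetical members of two already defined categories.
category_1 = {"chem_A", "chem_B", "chem_C"}
category_2 = {"chem_B", "chem_C", "chem_D"}

combined_and = combine_categories([category_1, category_2], "AND")
combined_or = combine_categories([category_1, category_2], "OR")
print(sorted(combined_and))
print(sorted(combined_or))
```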
G. Clustering of categories
This function distributes a defined category into clusters. A cluster is a sub-category of the general category that includes the chemicals with one unique combination of profiling results.
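Clustering by unique combination of profiling results can be sketched by using the set of profiling results of each chemical as a grouping key. The chemicals, the profile labels and the function name cluster_category are hypothetical, and the sketch only illustrates the principle.

```python
from collections import defaultdict


def cluster_category(category_profiles):
    """Group chemicals that share exactly the same set of profiling results."""
    clusters = defaultdict(list)
    for chemical, profiles in category_profiles.items():
        # frozenset makes the combination hashable and order-independent.
        clusters[frozenset(profiles)].append(chemical)
    return {key: sorted(members) for key, members in clusters.items()}


# Hypothetical profiling results for the chemicals of one category.
category = {
    "chem_A": {"p1", "p2"},
    "chem_B": {"p1"},
    "chem_C": {"p2", "p1"},
    "chem_D": {"p1"},
}

clusters = cluster_category(category)
print(len(clusters), "clusters generated")
```

Every chemical lands in exactly one cluster, so the clusters partition the general category.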
How to cluster the category?
Once the category is defined (1), the user clicks the Clustering button (2); a message with the number of generated clusters then appears (3) (Figure 31).
Figure 31
The clusters of the category appear as sub-nodes (1) of the general category (2) (Figure 32).
Figure 32
Clicking on a cluster (1) opens a new matrix with the chemicals of that cluster (2) (Figure 33).
Figure 33
H. Delete category
1. Delete single category
The Delete button deletes the selected category. The user selects the category to be deleted (1) and then clicks the Delete button (2) (Figure 34)
Figure 34
2. Delete all categories
The user can delete all defined categories simultaneously using the Delete all (1) button. All defined categories (2) are deleted (Figure 35).
Figure 35
After all categories are deleted, only the target chemical remains in the data matrix (Figure 36)
Figure 36