A. Aim of the module
The aim of the module “Filling Data Gap” is to give access to three different data gap filling tools:
• Read-across
• Trend analysis
• (Q)SAR models
Read-across and trend analysis use the available experimental data in the data matrix to fill a data gap. “(Q)SAR models” gives access to a library of external (Q)SAR models which have been integrated into the Toolbox.
Depending on the situation, the most relevant data gap mechanism should be chosen, taking into account the following considerations:
• Read-across is the appropriate data gap filling method for “qualitative” endpoints such as skin sensitization or mutagenicity, for which only a limited number of results are possible (e.g. positive, negative, equivocal). Furthermore, read-across is recommended for “quantitative” endpoints (e.g. 96h-LC50 for fish) if only a small number of analogues with experimental results are identified.
• Trend analysis is the appropriate data-gap filling method for “quantitative endpoints” (e.g., 96h-LC50 for fish) if a high number of analogues with experimental results are identified.
• “(Q)SAR models” can be used to fill a data gap if no adequate analogues are found for a target chemical.
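The considerations above can be sketched as a small decision helper. This is only an illustration, not part of the Toolbox: the function name `suggest_method`, the `Method` enum and in particular the threshold `min_for_trend` are assumptions, since the text does not say how many analogues count as a "high number".

```python
from enum import Enum


class Method(Enum):
    READ_ACROSS = "read-across"
    TREND_ANALYSIS = "trend analysis"
    QSAR_MODEL = "(Q)SAR model"


def suggest_method(endpoint_is_quantitative: bool,
                   n_analogues: int,
                   min_for_trend: int = 10) -> Method:
    """Suggest a data gap filling method.

    min_for_trend is a hypothetical cut-off between a "low" and a "high"
    number of analogues; the manual does not give a value.
    """
    if n_analogues == 0:
        # No adequate analogues: fall back to a (Q)SAR model.
        return Method.QSAR_MODEL
    if not endpoint_is_quantitative:
        # Qualitative endpoints (positive/negative/equivocal): read-across.
        return Method.READ_ACROSS
    if n_analogues >= min_for_trend:
        # Quantitative endpoint with many analogues: trend analysis.
        return Method.TREND_ANALYSIS
    # Quantitative endpoint with few analogues: read-across.
    return Method.READ_ACROSS
```

In practice the choice also depends on expert judgment about the quality of the analogues, which a simple rule like this cannot capture.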
When selecting read-across or trend analysis, the available data in the data matrix is used for filling a data gap. The user can further reduce the data set by using the profilers to eliminate chemicals which have different profiles compared to the target chemical.
Two situations can be distinguished:
• If a specific mechanism or mode of action relevant for the endpoint is identified for the target chemical, then all the analogues considered should have the same mechanism or mode of action.
• If no specific mechanism or mode of action relevant for the endpoint is identified for the target chemical, then none of the structural analogues considered should have a specific mechanism or mode of action either.
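The two situations can be expressed as a simple filter over profiling results. The sketch below is illustrative only: the function name, the data layout (a set of mechanism labels per chemical) and the use of exact set equality to mean "the same mechanism" are assumptions, not the Toolbox implementation.

```python
def consistent_analogues(target_mechanisms, analogue_mechanisms):
    """Return the ids of analogues that are mechanistically consistent
    with the target.

    target_mechanisms: set of mechanism labels found for the target.
    analogue_mechanisms: dict mapping analogue id -> set of labels.
    """
    if target_mechanisms:
        # Target has a specific mechanism: keep analogues sharing it
        # (exact set equality is an assumed, strict interpretation).
        return [cid for cid, mechs in analogue_mechanisms.items()
                if mechs == target_mechanisms]
    # Target has no specific mechanism: keep only analogues without one.
    return [cid for cid, mechs in analogue_mechanisms.items() if not mechs]
```

This is the same comparison that subcategorization by a profiler performs interactively.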
Categorical data, such as data for skin sensitization or Ames mutagenicity endpoints, are predicted using read-across or (Q)SAR methods.
Figure 1. Data Gap filling procedure
Before the data gap filling window opens, the Possible data inconsistency window appears (1). (Figure 2)
Figure 2
This feature alerts the user to possible data inconsistencies.
In the example in Figure 3, two fields, Assay (2) and Endpoint (3), are placed below the Type of method (In vivo) field (1) (highlighted cell). Data with different values in these two metadata fields (Assay and Endpoint) would be mixed together in data gap filling.
Figure 3
The user can filter the data-points that enter the gap filling module. To speed up the work, the user can use the Select all/Unselect all button from the pop-up menu. (Figure 4)
Figure 4
More detailed information about scales is presented in the About section. (See section Options/Unit/Edit scale definition)
Note: Only one scale/unit is allowed in data gap filling.
The number located at the bottom of the window (e.g. Selected 9/9) means that 9 out of a total of 9 data points will enter the data gap filling module.
C. Data Gap Filling window
After clicking the OK button the Data Gap Filling module starts. The next three snapshots illustrate the different types of gap filling methods:
Read-across window:
Trend analysis window:
(Q)SAR models window:
D. Common features in three gap filling methods
Although the three gap filling approaches differ, they share the following common features:
• Panels over the graph
• Color legend
• Menus of data gap filling
• Right click functionality
1. Panels over graph
Descriptors panel
The “Descriptors” panel is shown in Figure 5 with the pop-up menu expanded. Here the user can select the descriptor to be used as the X-axis of the graph. The Y-axis shows the values of the data-points, most commonly on a logarithmic scale. The units on the Y-axis can be changed in Options, subsection Units. The Descriptors panel is available for the read-across, trend analysis and (Q)SAR models data gap filling approaches. To pick a descriptor for the X-axis, the user has to select it (1) and then click the “Make active descriptor” item (3) in the pop-up menu (2).
Figure 5
Three further options are available in the pop-up menu shown in Figure 5:
• Collect data: if the user wants to use a custom descriptor which has not been calculated previously (all available databases in the Toolbox are indexed in advance and have their 2D and 3D calculations cached), he/she should click on Collect data in order to use this custom descriptor.
• Change descriptor units: when this option is invoked the following window appears (Figure 6):
Figure 6
Then the user could change the dimension of the selected descriptor. Also different conversions are allowed.
• Edit descriptor option – this pop-up window allows the user to select different type of calculation for a selected descriptor. This setting is developed especially for purposes of tautomeric set prediction. (Figure 7)
Figure 7
Note: The user is allowed to set more than one descriptor in Y-axis for the purposes of the read-across method only.
Prediction panel
The “Predictions” panel, shown in Figure 8, is the same for the three data gap filling approaches, with only slight differences: for read-across it highlights the neighboring data-points with respect to the descriptor used on the X-axis (by default the 5 nearest, which can be changed in Prediction approach options); for trend analysis it displays the trend line. For all approaches this panel shows the distribution of the data-points and the predicted value for the target chemical on a graph. At the bottom of the panel there is a dropdown list of descriptors (1) that can be used for the X-axis. This selection is for visualization purposes only; the prediction is recalculated only when the active descriptor is changed in the Descriptors panel. By default the logKow descriptor is used (2).
Figure 8
Adequacy panel
The Adequacy panel (Figure 9) houses the adequacy graph, which plots the observed value on one axis against the predicted value on the other. The coefficient of determination (R2) and the adjusted coefficient of determination (R2adj) (1) are displayed at the top of the graph.
Figure 9
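For readers who want to reproduce the two statistics shown on the adequacy graph, the following sketch uses the textbook definitions of R2 and adjusted R2. It is an assumption that the Toolbox uses exactly these formulas; the function names are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_points, n_descriptors):
    """Adjusted R2, penalizing the number of descriptors in the equation."""
    return 1.0 - (1.0 - r2) * (n_points - 1) / (n_points - n_descriptors - 1)
```

With observed values 1, 2, 3, 4 and predicted values 1.1, 1.9, 3.2, 3.8 the residual sum of squares is 0.10 and the total sum of squares is 5, giving R2 = 0.98 and, for one descriptor, adjusted R2 = 0.97.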
Cumulative frequency panel
The cumulative frequency is the frequency with which the value of the residual (EP.obs – EP.calc) is less than or equal to a reference residual value.
For example, if the cumulative frequency is 70% at a residual value of 0.2, then 70% of the training set members have residuals less than or equal to 0.2. Below is a snapshot of the cumulative frequency graph. (Figure 10)
Figure 10
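The definition can be illustrated with a few lines of code. This is a minimal sketch that follows the wording above literally (signed residuals, observed minus calculated); the function name is hypothetical and the residual values are made up for the example.

```python
def cumulative_frequency(residuals, reference):
    """Percentage of residuals that are less than or equal to `reference`."""
    count = sum(1 for r in residuals if r <= reference)
    return 100.0 * count / len(residuals)


# Ten illustrative residuals (EP.obs - EP.calc); seven are <= 0.2.
example_residuals = [-0.5, -0.1, 0.0, 0.05, 0.1, 0.15, 0.2, 0.3, 0.4, 0.6]
print(cumulative_frequency(example_residuals, 0.2))  # prints 70.0
```

Evaluating the function over a range of reference values gives the points of the cumulative frequency curve.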
Statistic panel
This panel shows the statistical characteristics of the regression equation (Figure 11). The upper part of the panel includes the statistics for the regression equation (1), while the lower part shows the coefficients included in the regression equation (2).
Figure 11
Residuals panel
This panel includes a graph showing the distribution of the residuals for a given endpoint versus a specified descriptor. (Figure 12)
Figure 12
2. Color Legend
The resulting graph plots the existing experimental results of all analogues (Y axis) according to a specified descriptor (X axis). The default descriptor is log Kow.
• Read-across: The dark red dots (1) on the graph represent the experimental data available for the analogues that are used for the read-across. The blue dots (2) represent the experimental data available for the analogues that are not used for the read-across, as they are further away from the target chemical on the X-axis. By default the five nearest analogues with respect to logKow are used in the read-across calculation (this can be changed; see Calculation options/Prediction approach options). The red dot (3) represents the estimated result for the target chemical based on the read-across from the analogues. (Figure 13)
Figure 13
• Trend analysis: The blue dots (1) on the graph represent the experimental data available for the chemicals in the category which are used in the regression equation. The red dot (2) represents the calculated data for the target chemical based on the regression calculation from the analogues in the category. The observed value for the target chemical, if available, is colored in orange (3). (Figure 14)
Figure 14
• QSAR model: The blue dots (1) on the graph represent the experimental data for the analogue chemicals, the small blue triangles (2) represent the observed data for the training set chemicals of the model (if available), and the small blue squares (3) represent the observed values for the test set chemicals (if available). (Figure 15)
Figure 15
More details for color legend are displayed below. (Figure 16)
Figure 16
The chart legend can be displayed from the Information submenu in the menus portion of the gap filling window by clicking on Show legend. (See section Information/Show legend)
3. Menus of the data gap filling
Figure 17 shows the menus of the data gap filling window.
Figure 17 – menus of data gap filling
Below is a short description of each menu.
3.1. Select/filter data (Figure 18)
Figure 18 – Select/filter data options
• Subcategorize – Subcategorization is one of the most powerful tools available to the user. It allows the user to refine a broad category into a more consistent and more pertinent set of chemicals, based on the chemicals’ properties, from which to derive a prediction. (Figure 19)
Figure 19
In the particular example illustrated for CAS 122043 (Figure 20), all results of the analogues are positive. The same sensitizing potential is therefore also predicted for the target chemical. By default, the Toolbox averages the result of the 5 “nearest” analogues with respect to log Kow (as defined by the X-axis descriptor) to estimate the result for the target chemical. The user can then verify the mechanistic robustness of the analogue approach.
Figure 20
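The default read-across calculation described above (averaging the results of the 5 nearest analogues along the X-axis descriptor) can be sketched as follows. This is not the Toolbox code: the function name and data layout are assumptions, and a numeric endpoint is used so that the average is well defined.

```python
def read_across(target_descriptor, analogues, k=5):
    """Average the observed values of the k analogues whose descriptor
    value (e.g. log Kow) is closest to that of the target.

    analogues: list of (descriptor_value, observed_value) pairs.
    """
    nearest = sorted(analogues,
                     key=lambda a: abs(a[0] - target_descriptor))[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Illustrative data: (log Kow, observed endpoint value).
example = [(0.5, 1.0), (1.5, 2.0), (1.9, 3.0), (2.2, 4.0),
           (2.6, 5.0), (3.0, 6.0), (7.0, 100.0)]
print(read_across(2.0, example))  # prints 4.0
```

In the example the analogues at log Kow 0.5 and 7.0 lie furthest from the target (log Kow 2.0) and are therefore left out of the average, which is the role played by the blue dots in the read-across graph.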
This can be verified by opening the Select/Filter Data menu and re-profiling the list of identified analogues by clicking on Subcategorize (1) and choosing Protein Binding by OASIS (2). The properties of the analogues (3) are then compared with the properties of the target chemical (4). In this particular example there is one analogue which has a different protein binding mechanism from that of the target chemical. It is colored in green on the graph and in the data matrix (5). (Figure 21)
Figure 23
Note: Keep in mind that the chemicals in the Databases and Inventories are not metabolized.
The right panel of the Subcategorization window consists of two parts: Target and Analogues. The two parts contain the profiling results of the target and of the analogues, respectively. The Target panel includes the profiling results (1) for the selected profiling method (2) for the target and its metabolites, if available (3). (Figure 24)
Figure 32
This has to be done for the other query chemical too (1). Then the user has to select the desired calculation settings (2) (Figure 33)
Figure 33
• Mark chemical by WS (Figure 34)
Figure 38
This feature allows the user to remove some of the data-points based on their metadata. After selecting the Filter by test condition button (1), a new Data Filter window (2) appears. Here the user can select the desired metadata field (3) to use for filtering and remove the dissimilar data (4). (Figure 39)
Note: this functionality removes experimental data for a given chemical.
Figure 43
• Remove marked chemicals/points (Figure 44)
Figure 44
This function (1) removes the marked (colored in green) chemicals/data-points (2). (Figure 45)
Figure 45
• Clear existing marks (Figure 46)
Figure 46
This function clears the markings of chemicals/data-points.
3.2. Selection navigation (Figure 47)
Figure 47
This functionality applies the following actions:
• Go back – undo one change.
• Go forward – redo one change.
• Go to first – go to initial state.
• Go to last – go to final state.
3.3. Gap filling approaches (Figure 48)
Figure 48
This functionality allows the user to switch between gap filling modules.
• Read-across – switch to read-across.
• Trend analysis – switch to trend analysis.
3.4. Descriptors/Data (Figure 49)
Figure 53
• Save domain as category – save the domain as a category. The user will be prompted to select a profiler to which the category is added. To use this feature, a custom profiler has to be created to serve as storage for the categories. This profiler/grouping method can then be used for categorization purposes.
• Save JRC XML QMRF – save the model as an XML file.
• Calculate Q2 – the program allows manual calculation of the Q2 parameter for categories containing more than 50 analogues. Calculating Q2 for such categories is quite time consuming and is therefore not performed automatically.
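To see why Q2 becomes expensive for large categories, consider the common leave-one-out definition, in which the regression is refitted once for every analogue. The sketch below assumes this definition and a single-descriptor linear model; the manual does not state which cross-validation scheme the Toolbox uses, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q2_leave_one_out(xs, ys):
    """Q2 = 1 - PRESS / SS_tot, refitting the line n times."""
    press = 0.0
    for i in range(len(xs)):
        # Fit without point i, then predict point i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot
```

Each additional analogue adds one more model fit, so the cost grows with the size of the category and with the complexity of the model being refitted.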
3.6. Calculation option (Figure 54)
Figure 54
• Data usage – this sets the way the Toolbox handles multiple data-points per single chemical. (Figure 55)
Figure 55
• Prediction approach options – for read-across sets the way the prediction is approximated – minimal, maximal, average, median, lower median, higher median, mode, lowest mode and highest mode. For trend analysis it sets the approximation type – averaging, linear and quadratic. (Figure 56)
Figure 59
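The read-across approximation options listed above map naturally onto standard summary statistics. The sketch below shows one possible mapping using Python's `statistics` module; the mapping of option names to functions (for example "lower median" to `median_low`) is an assumption made for illustration, not a description of the Toolbox internals.

```python
import statistics

# Assumed mapping from option name to summary statistic.
APPROXIMATIONS = {
    "minimal": min,
    "maximal": max,
    "average": statistics.mean,
    "median": statistics.median,
    "lower median": statistics.median_low,
    "higher median": statistics.median_high,
    "mode": statistics.mode,
    "lowest mode": lambda values: min(statistics.multimode(values)),
    "highest mode": lambda values: max(statistics.multimode(values)),
}


def approximate(values, option="average"):
    """Combine the analogues' values according to the chosen option."""
    return APPROXIMATIONS[option](values)
```

For the values 1, 2, 2, 3, 3, 4 the median is 2.5, while the lower and higher medians are 2 and 3; the lowest and highest modes are also 2 and 3 because both values occur twice.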
• Show all members of chemical sets – this function was developed for sets of chemicals. Clicking this button shows/hides the members of chemical sets. All chemicals in a given set are visualized, e.g. all tautomers in a tautomeric set (Figure 60, where the light blue dots represent three tautomers of a given chemical).
Figure 60
• Show confidence range – show/hide the confidence range in the prediction panel. The inner range shows the confidence range of the regression equation (1), while the outer range shows the confidence range of an individual prediction (2). (Figure 61)
Figure 61
• Show intercorrelations – shows the inter-correlations panel. When this button is clicked the user has to select the descriptors (1) for X and Y (2) axis used in the correlation in the Intercorr. window. (Figure 62)
Figure 62
3.8. Information (Figure 63)
Figure 63
• Focused details – show additional details about the selected data-point’s chemical:
This window includes Chem ID information (1), the structure of the chemical (2), a panel with calculated descriptors (3), a panel with profiling results across selected descriptors (4) and a panel with recalculated data (5). (Figure 64)
Figure 64
Double-clicking on Endpoint obs. Data (recalculated) displays a window with the experimental data. (Figure 65)
Figure 65
• Target details – show additional information about the target chemical.
• Differences to target – this shows the differences between the selected data-point (1) and target chemical with respect to all available Profiling methods. The profiling method(s) for which there are some differences are colored in orange (2) (Figure 66)
Figure 66
• All points within a region – when two or more chemicals in a category have the same logKow value, their dots may lie one behind the other on the graph. This function allows the user to see all the chemicals (dots) within one region. The user has to click All points within a region (1) and then drag the mouse (left mouse button) to specify a rectangular region (2). As a result a window with details for all chemicals in the region appears (3); selecting a specific point number (4) displays information for the selected chemical (5). (Figure 67)
Figure 67
• Show legend – show/hide the chart’s legend (1). (Figure 68)
Figure 74
If these checkboxes are ticked, only the models related to the specific node of the endpoint tree are available. For example, if the node Ecotoxicity>>Actinopterygii (fish) (1) is selected and the two aforementioned boxes are ticked (2), only the ECOSAR (USEPA) (3) model is available in the relevant QSAR models panel (4). The other models, which are associated with more specific endpoint nodes, are positioned in the panel QSAR models in nodes below (5). (Figure 75)
Figure 75
The models placed in the panel QSAR models in nodes below become available when the user selects the node to which the model is assigned. For instance, when the user expands the Actinopterygii (fish) node and then selects Pimephales promelas (1), the models related to this fish become available (2). (Figure 76)
Figure 76
3.2. Ranking of QSAR models
The results of the models related to a given endpoint can be compared using the ranking functionality. When the user clicks the Rank models button (1), the Models ranking window appears (2). (Figure 77)
Figure 77
Detailed information on the Models ranking window is given in Figure 78 below:
Figure 78
The visibility and ranking of the fields in the table can be managed using the Select Descriptor item of the pop-up menu (1); the QSAR descriptors window then appears (1). (Figure 79)
Figure 79
In the displayed QSAR descriptors panel the user can change the visibility of a field by double-clicking the cell of that field and setting it to YES or NO (1). For example, the Title field is visible or hidden depending on whether YES or NO is set in the corresponding cell. (Figure 80)
Figure 80
A specific model field can be reordered (Up or Down) or used in the ranking by double-clicking it and changing its current status (2). (Figure 80)
When the ranking settings are fixed, the models are ranked in the Relevant QSAR library window, for example by Title (1) or by availability in domain (2). (Figure 81)
Figure 81
3.3. Applying QSAR model
The model can be used to evaluate a category or a single chemical by applying it to all the chemicals in the category and analyzing the results. To apply the model simultaneously to all the chemicals:
• in the category, select the model (1), right-click on it and select “Predict Endpoint” (2) and “All chemicals” (3). (Figure 82)
• in the domain of the model, select the model (1), right-click on it and select “Predict Endpoint” (2) and “All chemicals in domain” (3). (Figure 83)
Predict All chemicals
Figure 82
Predict Chemicals in domain
Figure 84
Pop-up menus
• Rank models – this functionality is displayed in Ranking models section
• Sort by Date – when this option is set, the QSAR models are sorted by date
• Model About – some information for the QSAR model is provided (Figure 85)
Figure 85
• Model Options – this function is available for QSAR models that use descriptors which can be calculated in different ways. For example, if the user selects the BCF (EPISUIT) model (1) and selects Calculation options (2), a pop-up menu appears where the user can choose one way of calculating the BCF parameter (3), which is then used in the calculation of the BCF (EPISUIT) model. (Figure 86)
Figure 86
• Display Domain – this function visualizes the Domain (if available) for the selected QSAR model (Figure 87)
Figure 87
In this particular example the target chemical (3) belongs to the domain (2) of model M2-LC50-Pimephales promelas (1), because it fulfills the conditions (green ticked boundaries) of the model (4).
• Display tautomeric filter – functionality available when tautomeric filter is applied on a set of chemicals
• Apply tautomers filter – functionality available when a QSAR model is derived for a tautomeric set of chemicals
• Display QMRF – displays the QMRF file for the selected model (if available). (Figure 88)
Figure 88
• Display training set chemicals – it displays a separate window with the chemicals in the training set of the selected model (if available) with their available experimental data. (Figure 89)
Figure 89
• Display Test set chemical – it displays test set chemicals for a selected QSAR model (if available). (Figure 90)
Figure 90
• Delete Model – it deletes a custom QSAR model. Only custom QSAR models are allowed to be deleted
• Delete Predictions – it deletes predictions for selected QSAR model
• Check Calculations – it displays a window with comparison table for Regression Statistic and Regression Equations of original model and recalculated model. (Figure 91)
Figure 91
• Rebuild – it rebuilds the selected QSAR model. (Figure 92)
Figure 92
3.4. Creating a new QSAR model
The Toolbox allows the user to create a new custom QSAR. The next sequence of snapshots demonstrates building a QSAR model for predicting acute toxicity to Tetrahymena pyriformis of aldehydes. For the purpose of this study, a category of analogues should be available. In our case study we are investigating target chemical with CAS 66-25-1, the category used in defining a new QSAR is Aldehydes by US-EPA, the investigated endpoint is IGC 50 48 h, Tetrahymena pyriformis. (1) The user has to click the Create New QSAR button (2) and finally click the Apply (3) button. (Figure 93)
Figure 93
After the gap filling module is displayed, the descriptor on the X-axis has to be activated in order to build the regression (1). Log Kow is selected but not active; the user has to activate it manually (2). (Figure 94)
Figure 94
After activating the chemical descriptor used in the equation the user has to build the model (1). (Figure 95)
Figure 95
After the Build button (1) is clicked, the model is built and all analogues (dots) are colored in purple (2). (Figure 96)
Figure 96
Below are additional options available for QSAR models only:
• Mark chemicals out of domain
• Show analogues
• Show training set
• Show test set
• Build model
• Restore model
Select/Filter data
Mark chemicals out of domain – it marks those chemicals which are out of the domain of the model
Model QSAR
Show analogues – shows analogue chemicals included in the model
Show training set – shows chemicals included in the training set of the model if available
Show test set – shows chemicals in the test set of the model if available
Build model – builds the model
Restore model – restores the model
3.5. Application of QSAR model to defined category of chemicals
The Toolbox allows the user to apply the selected QSAR model to the chemicals present in the data matrix. First the user has to click on the cell with the corresponding QSAR model (1), select the QSAR model (2) and click the Apply button (3). (Figure 97)
Figure 97
In this particular case study there are 68 analogues with 68 experimental IGC 50 data, so 68 analogues will be included in the category in gap filling. After clicking the Apply button the Possible data inconsistency window will appear. Note that it will look for the scale of the applied model. In this particular case study the M2 model requires log (mol/l) (1); if there is a conversion for the experimental data from mg/l (2) to log (mol/l), then all 68 data points will be allowed into the gap filling. (Figure 98)
Figure 98
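The conversion from mg/l to log (mol/l) mentioned above can be illustrated as follows. The sketch assumes a base-10 logarithm of the molar concentration and requires the molecular weight of the chemical; the function name is hypothetical, and the actual scale definition used by a given model (for example log(1/C) instead of log(C)) should be checked in the Toolbox scale settings.

```python
import math


def mg_per_l_to_log_mol_per_l(value_mg_per_l, molecular_weight_g_per_mol):
    """Convert a concentration in mg/l to log10 of the concentration in mol/l."""
    grams_per_l = value_mg_per_l / 1000.0
    mol_per_l = grams_per_l / molecular_weight_g_per_mol
    return math.log10(mol_per_l)
```

For example, 100 mg/l of a chemical with a molecular weight of 100 g/mol corresponds to 0.001 mol/l, i.e. a value of -3 on the log (mol/l) scale.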
Now the QSAR model is applied to the 68 chemicals from data matrix. (Figure 99)
Figure 99
Now the user has to build the regression equation (1). (Figure 100)
Figure 100
Now the user is allowed to refine the category, applying the subcategorization procedure.
F. Data Gap Filling approaches using different modes of handling chemical structures
Two different modes for handling of chemical sets are defined:
• Individual Component Mode - The target chemical, its metabolites or mixture constituents are analyzed as individual structures
• Set mode - The chemical and its tautomers are handled as a set of structures. (Figure 101)
Figure 101
Three methodologies for estimating toxicity of set of chemicals are developed:
• Independent mode (Dissimilar action)
• Similar mode (Dose concentration)
• Specific models
Both concepts (independent action and dose/concentration addition) are based on the assumption that chemicals in a mixture do not influence each other’s toxicity, i.e. they do not interact with each other at the biological target site. Such chemicals can either elicit similar responses by a common or similar mode of action, or they act independently and may have different endpoints and/or different target organs. Both concepts have been suggested as default approaches in regulatory risk assessment of chemical mixtures.
Independent action (response addition, effects addition) occurs if chemicals act independently from each other, usually through different modes of action that do not influence each other.
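Independent action is commonly formalized by combining the fractional effects of the components as independent probabilities. The sketch below uses this standard response-addition formula; whether the Toolbox implements it in exactly this form is not stated in this manual, and the function name is hypothetical.

```python
def independent_action(component_effects):
    """Combined fractional effect of independently acting components.

    component_effects: fractional effects (0..1) that each component would
    cause on its own at its concentration in the mixture.
    E_mix = 1 - product(1 - E_i)
    """
    unaffected = 1.0
    for effect in component_effects:
        unaffected *= (1.0 - effect)
    return 1.0 - unaffected
```

Two components causing 10% and 20% effect on their own would thus be expected to cause 1 - 0.9 × 0.8 = 28% effect together.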
Dose/concentration addition (similar action, similar joint action) occurs if chemicals in a mixture act by the same mechanism/mode of action and differ only in their potencies. In principle, the doses or concentrations of the single components are added after being multiplied by a scaling factor that accounts for differences in the potency of the individual substances. The mixture dose/concentration (Dmix) is the sum of the adjusted doses/concentrations (aDi) of the individual components Di:
Dmix = aD1 + aD2 + … + aDn
The effect of a mixture of similarly acting compounds is equivalent to the effects of the sum of the potency-corrected (adjusted) doses/concentrations of each compound.
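A widely used consequence of dose/concentration addition is that the effect concentration of the mixture (e.g. its LC50) can be estimated from the effect concentrations of the components and their fractions in the mixture: 1/ECmix = sum(pi / ECi). The sketch below implements this textbook formula as an illustration; it is an assumption that the Toolbox calculation takes this form. The function name is hypothetical. Fractions and effect concentrations must be on a consistent basis (e.g. molar fractions with molar concentrations), and equal fractions are assumed when none are given.

```python
def mixture_effect_concentration(component_ecs, fractions=None):
    """Effect concentration of a mixture under concentration addition.

    component_ecs: effect concentrations (e.g. LC50) of the components.
    fractions: relative amounts of the components; if None, equal
    fractions (an equimolar mixture) are assumed.
    """
    n = len(component_ecs)
    if fractions is None:
        fractions = [1.0 / n] * n
    total = sum(fractions)  # normalize so that the fractions sum to 1
    return 1.0 / sum((p / total) / ec
                     for p, ec in zip(fractions, component_ecs))
```

For a 1:1 mixture of components with LC50 values of 10 and 30, the estimate is 1 / (0.5/10 + 0.5/30) = 15, which lies closer to the more potent component than the arithmetic mean does.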
Specific models – This methodology has the aim to use QSAR models developed on a basis of set of chemicals (mixtures) for purposes of mixture toxicity prediction. This section is under development.
Based on these methodologies for handling sets of chemicals, different ways of handling tautomers, mixtures and metabolites have been developed.
Note: If gap filling is entered with a set of chemicals whose component quantities are undefined, equimolar quantities of all components are assumed for the gap filling calculations.
1. Tautomeric set prediction
In comparison to TB 2.3, where tautomers are not handled, TB 3.0 handles tautomers as part of structure multiplication of the parent chemical, and all tautomers of the target chemical are analyzed in a single package (set mode).
The procedure of data gap filling using a tautomeric set is illustrated below for prediction of:
1.1. Skin sensitization endpoint for chemical with CAS 577-71-9
1.1.2. Enter the chemical via CAS (1) and select tautomeric set (2). The software then searches the selected databases for the tautomeric set of the entered chemical. (The databases are already tautomerized, calculated and profiled in tautomeric mode.) Click the OK button. (Figure 102)
Figure 106
1.1.6. Read-across is applied (Figure 107)
Figure 107
The tautomeric sets of the analogues have the same distribution of Protein binding alerts as the target set (1), so the prediction can be accepted (2). (Figure 108)
Figure 108
1.2. Aquatic toxicity (LC 50, 96h, Pimephales promelas) endpoint for chemical with CAS 89-62-3
1.2.1. Input CAS 89-62-3 (1), select tautomeric set (2), click OK (3). (Figure 109)
Figure 111
1.2.4. Apply trend analysis to the defined category for LC 50 96h, Pimephales promelas (1). (Figure 112)
Figure 112
1.2.5. There is a new functionality in TB for visualization of all tautomers in a tautomeric set. Open Visual options (1) and select Show all members of chemical sets (2). (Figure 113)
Figure 113
All members of the tautomeric set appear (1). (Figure 114)
Figure 114
1.2.6. The following subcategorization procedure is applied to refine the category:
• Remove chemicals having LC 50 more than their WS using WS fragments
• US-EPA New Chemical categories
• Chemical elements
• ECOSAR
The next snapshot illustrates the prediction of LC 50 after the subcategorization procedure – 56.3 mg/l (1). (Figure 115)
Figure 115
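The first subcategorization step, removing chemicals whose LC 50 exceeds their WS, amounts to a simple comparison per analogue. The sketch below assumes that WS stands for water solubility and that both values are expressed in the same unit; the function name and dictionary keys are hypothetical.

```python
def within_water_solubility(analogues):
    """Keep analogues whose LC50 does not exceed their water solubility.

    analogues: list of dicts with keys 'name', 'lc50_mg_l' and 'ws_mg_l'
    (both values in mg/l).
    """
    return [a for a in analogues if a["lc50_mg_l"] <= a["ws_mg_l"]]
```

An LC 50 above the water solubility cannot correspond to a truly dissolved concentration, which is why such data-points are removed before the prediction is made.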
In TB 3.0 a new functionality, Apply as filter, has been developed (1). (Figure 116)
Figure 117
Figure 118
1.2.8. Finally accept the prediction (1). (Figure 119)
Figure 119
Recommendations
1. For toxic effects conditioned by cell signaling networks (such as skin sensitization, genetic toxicity, etc.) highly reactive tautomers appear to be responsible for the observed toxicity.
Recommendation: use the complete tautomeric representations of the chemicals
2. For toxic effects conditioned by less specific interactions (such as mortality, growth inhibition, immobilization, etc.) the stable tautomeric forms appear to be the dominant toxicants. Databases and Inventories usually contain the most stable tautomeric form.
Recommendation: use the most stable tautomers for representation of the chemicals
2. Quantitative mixtures toxicity prediction
Defined mixtures are handled as part of the structure multiplication of parent chemical. Three new options for prediction of mixtures are available based on the mode of action of the constituents:
• Acting Independently (with different mode of action)
• Acting Similarly (with same mode of action)
• Acting Specifically (specific models for predicting toxicity of mixtures could be applied)
The procedure for assessing mixture toxicity is illustrated below for the following two endpoints:
• Aquatic toxicity
• Skin sensitization
The following mixture with defined quantities is used in predicting the two endpoints mentioned above. (Figure 120)
Figure 120
2.1. Predicting aquatic toxicity of mixtures:
2.1.1. Input of mixture. TB 3.0 includes a feature for defining the quantities of the components of a mixture. Once the constituents of the mixture are pasted or drawn (1) in the 2D editor, a specific button allowing input of quantities appears (2). The quantities (3) of each of the components, along with their units (4), are added manually. (Figure 121)
Figure 121
After defining the quantities they will appear in the panel with molecular structure.
2.1.2. Profiling components of the mixture – the user has to switch to Individual mode (1) and select the relevant profilers (2). In our case study we select ECOSAR, MOA of action, US-EPA. Then profile the mixture. As can be seen, all components have the same mode of action (3). (Figure 122)
Figure 122
2.1.3. Gathering experimental data – in this particular case, related to aquatic toxicity, the user has to select the Aquatic databases (1) and gather experimental data (2). The available experimental data appear in the data matrix (3). (Figure 123)
Figure 123
2.1.4. Gap filling approach using Similar mode of action. In this particular case study Similar mode is applied for the calculation because the investigated mixture has defined quantities and all components have the same mode of action. The user has to click on the cell related to LC 50, 96h, Pimephales promelas (1) for the mixture, select Similar mode (2) and click the Apply button (3). (Figure 124)
Figure 124
The prediction result (1) accounts for quantities of each component and uses dose concentration calculation (2) for prediction of LC 50. (Figure 125)
Figure 126
The procedure for a read-across prediction for one of the components is illustrated below:
2.2.1. Focus on constituent without experimental data (1). It then appears in a new data matrix (2). (Figure 127)
Figure 129
Almost all analogs have been found to be positive. Predicted SS effect of the target is positive (1). (Figure 130)
Figure 130
Based on the prediction for the constituent (1) without experimental data and two other constituents with experimental data (2) the read-across prediction for mixture could be performed. (Figure 131)
Figure 131
2.2.4. Read across is applied for the mixture (assuming Independent Mode of Action) (1). “Maximal” approximation type (2) is set by default for read across of categorical endpoints. (Figure 132)
Figure 132
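The “Maximal” approximation for a categorical endpoint can be read as taking the most severe result among the constituents. The sketch below illustrates that reading; the severity ordering of the categories (negative < equivocal < positive) and the function name are assumptions made for this example, not definitions taken from the Toolbox.

```python
# Assumed severity ordering of categorical results.
SEVERITY = {"negative": 0, "equivocal": 1, "positive": 2}


def maximal_category(results):
    """Return the most severe categorical result among the constituents."""
    return max(results, key=SEVERITY.__getitem__)
```

Under this reading, a mixture with one positive constituent is predicted positive regardless of how many negative constituents it contains, which is a conservative choice.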
Note: TB 3.0 uses mixtures with defined quantities. If the quantities of the components are not defined, they are assumed to be equimolar.
3. Prediction accounting for metabolism
In TB 2.3 metabolism could be used in Profiling and Subcategorization only, while in TB 3.0 generated metabolites can be used as representatives of the target chemical. Data Gap Filling can be applied to selected metabolites and predictions transferred to parent chemical.
The metabolites together with the target can also be treated as a set of chemicals, and predictions can be made for the set.
Below is an illustration of a read-across prediction for skin sensitization applied to a selected metabolite for chemical “trans-2,cis-6-nonadienol” (CAS# 28069-72-9).
In the scheme below there are no alerts for the parent chemical, so an investigation of the metabolites of the target chemical can be performed:
Our target chemical has no protein binding alert; however, it has six metabolites which have an alerting group responsible for protein interaction. The gap filling procedure applied to a selected metabolite, and the transfer of the prediction to the parent chemical, are as follows:
3.1. Entering target (CAS# 28069-72-9) chemical by CAS number. (Figure 133)
Figure 133
3.2. Multiplication of the target chemical via the skin metabolism simulator – the user has to right-click on the chemical (1) and select Multiplication>> Metabolism/Transformations>>Skin metabolism simulator (2). (Figure 134)
Figure 134
Then all the metabolites appear in a tree-like form. (Figure 135)
Figure 135
3.3. In the Profiling step the user can apply the Protein binding profilers relevant to the skin sensitization endpoint to all metabolites as a package or in individual mode. (Figure 136)
Figure 136
3.4. The next step in the workflow is to gather experimental data – Skin sensitization is selected (1), then click Gather (2). As can be seen, there is a positive experimental result for the parent chemical. (Figure 137)
Figure 140
3.8. The following subcategorization procedure is applied:
o Protein binding by OASIS
o Protein binding by OECD
o Protein binding potency
Below is the read-across analysis after subcategorization procedure (1). (Figure 141)
Figure 141
3.9. The user has to accept prediction (1) and return to data matrix (2) in order to continue with transferring the prediction of metabolite to the target chemical. (Figure 142)
Figure 142
Before returning to the data matrix a series of messages appears.
Note: The appearance of these messages is optional and is governed by the General options/Reports section.
The first message informs the user that the model is still not saved (1) and invites the user to save the model. If the Yes button is clicked then an Edit model window appears and invites the user to fill in the fields (2). If the No button is clicked then the software will not save the model. (Figure 143)
Figure 143
The next message asks the user to specify the profilers relevant to the investigated endpoint. These selected profilers will appear in the report. If the Yes button is clicked the window with all profilers appears where the user can select the desired profilers (2). By default there are some profilers selected (this is optional and default options could be changed in General Options/Report). If the No button is clicked then the profilers selected by default will appear in the report. (Figure 144)
Figure 144
The next message (1) asks the user whether he/she wants to collect additional data for the analogues from the data matrix for reporting purposes. If the Yes button (2) is clicked, a window with the Endpoint tree nodes will appear and the user can specify the node for which the experimental data will be reported. If the No button is clicked, the default experimental data sets will be reported (this is optional and can be changed in Options/Reports). (Figure 145)
Figure 145
3.10. Finally click Return to datamatrix.
3.11. In order to transfer the prediction for the metabolite to the parent chemical, the user has to return to the parent chemical’s data matrix. Return to Input (1) and click on the first node in the current document, which contains the parent chemical (2). The data matrix of the parent chemical is now displayed. (Figure 146)
Figure 147
3.13. Accept the prediction. Now the prediction of metabolite is transferred to the parent chemical (1). (Figure 148)
Figure 148
G. Right click menus
Right-clicking a cell with accepted values provides the user with several options. (Figure 149)
Figure 151
Edit prediction info – allows the user to fill in or edit the fields which appear in the Toolbox report
Report – generates a report for the selected prediction (if multiple predictions are available)
IUCLID5 – export prediction via i5z files or via Web Services (see IUCLID export)