Aim of the module “Endpoint”
In the module “Endpoint” the user can retrieve experimental results from the resident databases.
A: Endpoint panel
The main view of the Endpoint panel is shown in Figure 1. All databases are listed in panel 1, while the inventories are presented in panel 2.
The Endpoint section includes two sub-sections: Databases and Inventories.
The databases include chemicals and their experimental data, while the inventories include chemicals only.
Each database in the Toolbox includes chemicals with measured data. Chemicals are presented with their identification information, such as CAS number, chemical name and, if available, SMILES. Measured data (labeled with “M” in the data matrix) can be of “quantitative” or “qualitative” type. Scale conversions have been developed so that all available observed data can be saved and used. For more details about scales and scale conversions, see section Options/Unit/ Edit Scale.
Databases available in TB 3.0 are listed in Table 1.
Table 1. List of databases available in Toolbox 3.0
Based on the different types of experimental data, a new database organization was developed for TB 3.0. It follows the organization of the predefined nodes of the Endpoint tree (see section Endpoint tree in Data matrix).
The databases are distributed across four basic sections:
• Physical Chemical Properties
• Environmental Fate and Transport
• Ecotoxicological Information
• Human Health Hazards
Note: If a database includes endpoints from more than one section, it is listed in each of those sections. One such database is ECHA CHEM, which appears in all sections. The distribution of all databases implemented in TB v3.0 is shown in Figure 2.
To extract data related to a specific endpoint, the user first selects the specific database (e.g. Aquatic OASIS) (1). The parent node is then marked (2), with a full mark if all of its databases are selected or a semi-mark if only some of them are selected. Finally, the user clicks Gather data (3). (Figure 3)
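The full mark / semi-mark behaviour of the parent node can be pictured as a three-state check box. The sketch below is only an illustration of that rule, not the Toolbox's actual implementation; the `Mark` enum and the `parent_mark` function are hypothetical names introduced here.

```python
from enum import Enum


class Mark(Enum):
    """Hypothetical mark states of a parent node in the database tree."""
    NONE = "none"
    SEMI = "semi"
    FULL = "full"


def parent_mark(child_selected):
    """Return the parent mark for the selection states of its databases."""
    states = list(child_selected)
    if states and all(states):
        return Mark.FULL   # every database under the node is selected
    if any(states):
        return Mark.SEMI   # only some databases are selected
    return Mark.NONE       # nothing is selected


# Illustrative section with three databases, only one of them selected.
section = {
    "Aquatic OASIS": True,
    "Aquatic US-EPA ECOTOX": False,
    "ECHA CHEM": False,
}
print(parent_mark(section.values()))
```

With only Aquatic OASIS selected, the sketch reports a semi-mark for the parent node, which matches the situation described above.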
After clicking Gather (1), the measured data are extracted from the selected database (2) (in this case Aquatic OASIS) and then appear in the data matrix, labeled with “M” (3). (Figure 4)
Details of the buttons placed above the databases are given below. (Figure 5)
1. Select All – selects all databases. Click the Select All button (1); all databases are then selected (2). (Figure 6)
2. Unselect All – unselects all databases. Click the Unselect All button (1); all databases are then unselected (2). (Figure 7)
3. Invert – inverts the last performed selection of databases. For example, if the user selects Aquatic OASIS (1) and clicks Invert (2), the software selects all other databases (3) and deselects Aquatic OASIS. (Figure 8)
4. About – shows a short description of the highlighted database. Highlight the desired database (1) and click the About button (2); the short description then appears (3). (Figure 9)
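The three selection buttons can be read as simple set operations on the list of databases. The following is a minimal sketch under that assumption; the function names are hypothetical and the database list is shortened for illustration.

```python
def select_all(databases):
    """Select All: every database ends up selected."""
    return set(databases)


def unselect_all(databases):
    """Unselect All: no database remains selected."""
    return set()


def invert(databases, selected):
    """Invert: selected databases become unselected and vice versa."""
    return set(databases) - set(selected)


# Shortened, illustrative list of databases.
databases = ["Aquatic OASIS", "Aquatic US-EPA ECOTOX", "ECHA CHEM"]

selected = {"Aquatic OASIS"}
selected = invert(databases, selected)
print(sorted(selected))
```

Inverting a selection that contains only Aquatic OASIS leaves every other database selected, as in the Invert example above.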
The available inventories in Toolbox 3.0 are listed in Table 2. The databases include chemicals with available experimental data, while the inventories include chemicals only. Chemicals in the inventories are presented with CAS number, chemical name and SMILES.
Table 2. List of inventories available in Toolbox 3.0
The same options for selecting, unselecting and inverting the selection are available as for the databases. (Figure 10)
1. Select All – this selects all inventories
2. Unselect All – this unselects all inventories
3. Invert – inverts the last performed selection of inventories
4. About – displays a short description of the highlighted inventory. Highlight the inventory (1) and click the About button (2); the short description window then appears (3). (Figure 11)
D. Gather data
Data gathering can be executed in one of two basic ways:
• Collecting all data for all endpoints: click the Select All button (1) so that all databases are selected (2), then click the Gather button (3). All available measured data are extracted for the chemical(s) loaded in the data matrix (4). (Figure 12) Before the measured data are loaded into the data matrix, two additional windows appear: Read data and Repeated values. For details, see the subsections Read data and Repeated values. This process, however, can be extremely time-consuming. A more effective approach is to narrow down the required databases and endpoints before executing the gather data query.
• Collecting data on a more narrowly defined basis (e.g., for a single endpoint or a limited number of endpoints): select the database(s) relevant to the endpoints under investigation (1), then click the Gather button (2). All available measured data extracted from the selected database(s) for the chemical(s) loaded in the data matrix then appear (3). (Figure 13) Before the measured data are loaded into the data matrix, two additional windows appear: Read data and Repeated values. For details, see the subsections Read data and Repeated values.
Read data window
After clicking the Gather data button, the Read data window appears. (Figure 14)
Details of the Read data window:
1. All endpoints – the software reads and extracts all data points from the database(s) and endpoint paths selected in the main window. For example, if the Aquatic US-EPA ECOTOX database (1) is selected and the user selects All endpoints (2), then all data points associated with all checked endpoint paths (namely Environmental Fate and Ecotoxicological Information) are extracted from the selected database (3). (Figure 15)
2. Choose – this option allows the user to further specify which endpoint paths are queried for the selected database(s). If the user selects Aquatic US-EPA ECOTOX (1) and Terrestrial US-EPA ECOTOX (1) and then clicks Gather (2), the Read data dialog is presented. There the user can select Choose (3) and specify the desired nodes. As a result, only the specified data are extracted. In this example, only aquatic toxicity data are extracted (4), while the terrestrial and sediment data are disregarded (5). (Figure 16)
3. From Tautomers – this option allows the user to gather the available experimental data for a chemical, taking into account experimental data for its tautomeric forms. The example below shows the data extracted from all available databases (1) for the chemical with CAS 50442 when this option is not selected (1). As can be seen, only Human Health Hazards data are available. (Figure 17)
Figure 18 shows data extracted (1) and loaded (3) for a chemical with CAS 50442 for all tautomers (2).
By right-clicking the chemical (1) and selecting Expand by CAS (2) from the popup menu, the user can see all tautomers and the available experimental data (3). (More details about MCAS are given in the Data matrix subsection of the Chemical input section.) (Figure 19)
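The effect of the Choose option described above can be pictured as filtering data points by their position in the endpoint tree. The sketch below is an illustration only: the record layout, the node names under Ecotoxicological Information and the `filter_by_paths` function are assumptions made for this example and do not reflect the Toolbox's internal data model.

```python
def filter_by_paths(records, chosen_paths):
    """Keep records whose endpoint path starts with one of the chosen paths."""
    chosen = [tuple(p) for p in chosen_paths]
    kept = []
    for rec in records:
        path = tuple(rec["path"])
        # A record matches if a chosen path is a prefix of its own path.
        if any(path[:len(p)] == p for p in chosen):
            kept.append(rec)
    return kept


# Hypothetical records; the second-level node names are placeholders.
records = [
    {"database": "Aquatic US-EPA ECOTOX",
     "path": ("Ecotoxicological Information", "Aquatic Toxicity")},
    {"database": "Terrestrial US-EPA ECOTOX",
     "path": ("Ecotoxicological Information", "Terrestrial Toxicity")},
    {"database": "Terrestrial US-EPA ECOTOX",
     "path": ("Ecotoxicological Information", "Sediment Toxicity")},
]

aquatic_only = filter_by_paths(
    records, [("Ecotoxicological Information", "Aquatic Toxicity")]
)
print([rec["database"] for rec in aquatic_only])
```

Choosing the whole Ecotoxicological Information node instead would keep all three records, which corresponds to the All endpoints behaviour for that section.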
After specifying the paths in the Read data panel the application queries the database repository and, if needed, displays the Repeating values dialog:
Because the data in the Toolbox databases overlap for chemicals that appear in more than one database, the same data may be found more than once. Data are grouped by chemical ID (SMILES/CAS), value and endpoint tree position. These groups are displayed, and the user can choose which data points to use. The overlap is illustrated in the Repeating values window.
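The grouping rule can be sketched as a dictionary keyed on chemical ID, endpoint tree position and value. This is a minimal illustration, assuming each data point is a plain dictionary; the field names, the `group_repeating` function and all sample values are made up for the example.

```python
from collections import defaultdict


def group_repeating(datapoints):
    """Group data points by chemical ID, endpoint position and value.

    Only groups with more than one member are returned, since those
    are the ones that count as repeating values.
    """
    groups = defaultdict(list)
    for dp in datapoints:
        key = (dp["cas"], dp["smiles"], dp["endpoint_path"], dp["value"])
        groups[key].append(dp)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Placeholder data points; identifiers, values and sources are invented.
datapoints = [
    {"cas": "0000-00-0", "smiles": "CCO", "endpoint_path": ("Aquatic Toxicity", "IGC50"),
     "value": 1.0, "source": "Database A"},
    {"cas": "0000-00-0", "smiles": "CCO", "endpoint_path": ("Aquatic Toxicity", "IGC50"),
     "value": 1.0, "source": "Database B"},
    {"cas": "0000-00-0", "smiles": "CCO", "endpoint_path": ("Aquatic Toxicity", "IGC50"),
     "value": 2.5, "source": "Database B"},
]

repeating = group_repeating(datapoints)
print(len(repeating))
```

In this example the first two data points form one repeating group, while the third has a different value and is therefore not reported.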
Details of the Repeating values window are shown in Figure 20.
1 – Short information for number of repeating values
2 – Detailed information for data points for a given chemical, displayed on separate rows.
3 – Column with same measured data for a given endpoint (in this case IGC 50) for a chemical
4 – Column with different metadata information.
5 – Select one – selects only one value for a given chemical
6 – OK button – confirming the selected action
7 – Managing buttons
The user can use the Select one button (5), which leaves only one data point from each group, and continue the workflow with single data values by clicking the OK button (6). However, it is up to the user to perform due diligence and determine whether two data points are in fact the same value reported by different sources or two different data points that have the same value by chance.
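Reducing each group to a single data point can be sketched as follows. Which member the Toolbox actually keeps is not stated here, so keeping the first one is an assumption of this example; the `select_one` function and the `source` field are hypothetical. The sources of the discarded data points are returned as well, to support the manual check described above.

```python
def select_one(groups):
    """Keep one data point per group and report the sources of the rest.

    Assumption: the first member of each group is the one that is kept.
    """
    kept = []
    dropped_sources = []
    for members in groups:
        kept.append(members[0])
        dropped_sources.append([dp["source"] for dp in members[1:]])
    return kept, dropped_sources


# Placeholder groups of repeating values; all entries are invented.
groups = [
    [{"value": 1.0, "source": "Database A"}, {"value": 1.0, "source": "Database B"}],
    [{"value": 3.2, "source": "Database C"}, {"value": 3.2, "source": "Database A"}],
]

kept, dropped_sources = select_one(groups)
print(len(kept), dropped_sources)
```

Reviewing the reported sources helps decide whether a discarded data point was a true duplicate or an independent measurement with the same value.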
Notes:
1. The Toolbox databases include chemicals with experimental data, while the inventories include chemicals only, without experimental data.
2. The databases are profiled and calculated in advance, and the results are stored in a cache. This accelerates the search for analogues with experimental data (defining categories).