The following examples illustrate the different ways of using data and making predictions on categorical scales. They are based on an imaginary scale with four possible values and on three chemicals, each having more than one observed value for the selected endpoint.
The first picture represents the real observed values for the three chemicals.
1) DataUsage = All, all points are taken into the calculations.
Fig. 1 The real observed values for chemicals (the same picture for using all data points)
The next pictures represent the recalculated values for the three chemicals. The real observed values are blank and the recalculated values are blue.
2) DataUsage = Minimal, one point per chemical is given.
Fig. 2 The recalculated values for chemicals (when using minimal value for each chemical)
3) DataUsage = Maximal, one point per chemical is given.
Fig. 3 The recalculated values for chemicals (when using maximal value for each chemical)
4) DataUsage = Median(s), Chemicals 1 and 2 have two medians; Chemical 3 has one median (Value 2 is not taken into account here).
Fig. 4 The recalculated values for chemicals (when using median values for each chemical)
5) DataUsage = Lower median, one point per chemical is given.
Fig. 5 The recalculated values for chemicals (when using lower median value for each chemical)
6) DataUsage = Higher median, one point per chemical is given.
Fig. 6 The recalculated values for chemicals (when using higher median value for each chemical)
7) DataUsage = Mode(s), Chemicals 2 and 3 have two modes; Chemical 1 has four modes.
Fig. 7 The recalculated values for chemicals (when using mode values for each chemical)
8) DataUsage = Lowest mode, one point per chemical is given.
Fig. 8 The recalculated values for chemicals (when using lowest mode value for each chemical)
9) DataUsage = Highest mode, one point per chemical is given.
Fig. 9 The recalculated values for chemicals (when using highest mode value for each chemical)
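The nine DataUsage options above can be sketched in a few lines of Python. This is only an illustrative sketch, not the system's implementation: the function name `reduce_points`, the sample observations and the median convention (the middle element or elements of the sorted points) are assumptions. The ordering of the scale, with Value 4 as the lowest category and Value 1 as the highest, is inferred from the prediction examples in this section.

```python
from collections import Counter

# Assumed ordering of the imaginary scale, lowest to highest
# (inferred from the prediction examples, not stated by the system).
SCALE = ["Value 4", "Value 3", "Value 2", "Value 1"]
RANK = {name: i for i, name in enumerate(SCALE)}


def reduce_points(values, data_usage):
    """Return the points kept for one chemical under a DataUsage option.

    Hypothetical helper; expects at least one observed value.
    """
    ordered = sorted(values, key=RANK.__getitem__)
    n = len(ordered)
    # Assumed median convention: the middle element(s) of the sorted points.
    lower_median, higher_median = ordered[(n - 1) // 2], ordered[n // 2]
    counts = Counter(ordered)
    top = max(counts.values())
    modes = [v for v in SCALE if counts[v] == top]  # lowest to highest
    options = {
        "All": ordered,
        "Minimal": [ordered[0]],
        "Maximal": [ordered[-1]],
        "Median(s)": sorted({lower_median, higher_median}, key=RANK.__getitem__),
        "Lower median": [lower_median],
        "Higher median": [higher_median],
        "Mode(s)": modes,
        "Lowest mode": [modes[0]],
        "Highest mode": [modes[-1]],
    }
    return options[data_usage]


# Hypothetical observations for one chemical (not the data behind the figures).
observed = ["Value 3", "Value 4", "Value 2", "Value 1", "Value 4", "Value 2"]
for usage in ("Minimal", "Maximal", "Median(s)", "Mode(s)", "Highest mode"):
    print(usage, "->", reduce_points(observed, usage))
```

With these sample observations the chemical keeps two medians (Value 3 and Value 2) and two modes (Value 4 and Value 2), while every other option keeps a single point.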
Making predictions
Let’s assume that Chemicals 2 and 3 are the neighbors that determine the prediction. The cases shown above then give the following predictions:
1) DataUsage = All, the prediction value is:
- Value 4, when the approximation type is “Minimal”
- Value 1, when the approximation type is “Maximal”
- No value, when the approximation type is “Median” – Value 2 and Value 3 are both medians, so the system cannot make a decision automatically
- Value 3, when the approximation type is “Lower median”
- Value 2, when the approximation type is “Higher median”
- Value 3, when the approximation type is “Mode”, “Lowest mode” or “Highest mode” – 7 neighbor points are available for this value; only one mode value is available in this case, so the last three approximation types give the same prediction value.
Fig. 10 The prediction values when using all data points
2) DataUsage = Minimal, the prediction value is:
- Value 4, for all approximation types.
Fig. 11 The prediction value when using minimal value for each chemical
3) DataUsage = Maximal, the prediction value is:
- Value 1, for all approximation types.
Fig. 12 The prediction value when using maximal value for each chemical
4) DataUsage = Median(s), the prediction value is:
- Value 3, when the approximation type is “Minimal”
- Value 2, when the approximation type is “Maximal”
- No value, when the approximation type is “Median” – Value 2 and Value 3 are both medians, so the system cannot make a decision automatically
- Value 3, when the approximation type is “Lower median”
- Value 2, when the approximation type is “Higher median”
- Value 3, when the approximation type is “Mode”, “Lowest mode” or “Highest mode” – 2 neighbor points are available for this value; only one mode value is available in this case, so the last three approximation types give the same prediction value.
Fig. 13 The prediction values when using median values for each chemical
5) DataUsage = Lower median, the prediction value is:
- Value 3, for all approximation types.
Fig. 14 The prediction value when using lower median value for each chemical
6) DataUsage = Higher median, the prediction value is:
- Value 3, when the approximation type is “Minimal”
- Value 2, when the approximation type is “Maximal”
- No value, when the approximation type is “Median” – Value 2 and Value 3 are both medians, so the system cannot make a decision automatically
- Value 3, when the approximation type is “Lower median”
- Value 2, when the approximation type is “Higher median”
- No value, when the approximation type is “Mode” – Value 2 and Value 3 are both modes, so the system cannot make a decision automatically
- Value 3, when the approximation type is “Lowest mode”
- Value 2, when the approximation type is “Highest mode”
Fig. 15 The prediction value when using higher median value for each chemical
7) DataUsage = Mode(s), the prediction value is:
- Value 4, when the approximation type is “Minimal”
- Value 1, when the approximation type is “Maximal”
- Value 3, when the approximation type is “Median”, “Lower median” or “Higher median” – only one median value is available in this case, so these three approximation types give the same prediction value
- Value 3, when the approximation type is “Mode”, “Lowest mode” or “Highest mode” – 2 neighbor points are available for this value; only one mode value is available in this case, so the last three approximation types give the same prediction value.
Fig. 16 The prediction values when using mode values for each chemical
8) DataUsage = Lowest mode, the prediction value is:
- Value 4, when the approximation type is “Minimal”
- Value 3, when the approximation type is “Maximal”
- No value, when the approximation type is “Median” – Value 3 and Value 4 are both medians, so the system cannot make a decision automatically
- Value 4, when the approximation type is “Lower median”
- Value 3, when the approximation type is “Higher median”
- No value, when the approximation type is “Mode” – Value 3 and Value 4 are both modes, so the system cannot make a decision automatically
- Value 4, when the approximation type is “Lowest mode”
- Value 3, when the approximation type is “Highest mode”.
Fig. 17 The prediction value when using lowest mode value for each chemical
9) DataUsage = Highest mode, the prediction value is:
- Value 3, when the approximation type is “Minimal”
- Value 1, when the approximation type is “Maximal”
- No value, when the approximation type is “Median” – Value 1 and Value 3 are both medians (Value 2 is not taken into account here), so the system cannot make a decision automatically
- Value 3, when the approximation type is “Lower median”
- Value 1, when the approximation type is “Higher median”
- No value, when the approximation type is “Mode” – Value 1 and Value 3 are both modes, so the system cannot make a decision automatically
- Value 3, when the approximation type is “Lowest mode”
- Value 1, when the approximation type is “Highest mode”.
Fig. 18 The prediction value when using highest mode value for each chemical
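The way an approximation type turns the neighbor points into a single prediction, or into no value when two candidates tie, can be sketched as follows. This is an illustrative sketch under assumptions, not the system's actual algorithm: the function name `predict` is hypothetical, the scale is assumed to run from Value 4 (lowest) to Value 1 (highest), and the medians are taken as the middle element or elements of the sorted points. The neighbor points used below are inferred from the predictions listed for the Mode(s), Lowest mode and Highest mode cases, not read from the figures.

```python
from collections import Counter

# Assumed ordering of the imaginary scale, lowest to highest.
SCALE = ["Value 4", "Value 3", "Value 2", "Value 1"]
RANK = {name: i for i, name in enumerate(SCALE)}


def predict(points, approximation):
    """Return the predicted category, or None when no single value can be chosen.

    Hypothetical helper; expects at least one neighbor point.
    """
    ordered = sorted(points, key=RANK.__getitem__)
    n = len(ordered)
    # Assumed median convention: the middle element(s) of the sorted points.
    lower_median, higher_median = ordered[(n - 1) // 2], ordered[n // 2]
    counts = Counter(ordered)
    top = max(counts.values())
    modes = [v for v in SCALE if counts[v] == top]  # lowest to highest
    results = {
        "Minimal": ordered[0],
        "Maximal": ordered[-1],
        # Two different medians: no automatic decision.
        "Median": lower_median if lower_median == higher_median else None,
        "Lower median": lower_median,
        "Higher median": higher_median,
        # More than one mode: no automatic decision.
        "Mode": modes[0] if len(modes) == 1 else None,
        "Lowest mode": modes[0],
        "Highest mode": modes[-1],
    }
    return results[approximation]


# Neighbor points inferred from the listed predictions (an assumption).
mode_points = ["Value 4", "Value 3", "Value 3", "Value 1"]  # DataUsage = Mode(s)
lowest_mode_points = ["Value 4", "Value 3"]                 # DataUsage = Lowest mode
highest_mode_points = ["Value 3", "Value 1"]                # DataUsage = Highest mode

for approximation in ("Minimal", "Maximal", "Median", "Mode"):
    print(approximation, "->", predict(highest_mode_points, approximation))
```

For the two points of the Highest mode case the sketch returns Value 3 for “Minimal”, Value 1 for “Maximal”, and no value (`None`) for “Median” and “Mode”, in line with the list above.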