I created a random forest model with H2O and exported it as PMML. Below is a snippet showing the first decision tree with its nodes and their IDs.
[...]
<TreeModel functionName="regression" missingValueStrategy="defaultChild">
  <MiningSchema>
    <MiningField name="Sepal.Length"/>
    <MiningField name="Petal.Width"/>
  </MiningSchema>
  <Node id="1" defaultChild="2">
    <True/>
    <Node id="2" defaultChild="3">
      <SimplePredicate field="Sepal.Length" operator="lessThan" value="5.450927734375"/>
      <Node id="3" score="1.0">
        <SimplePredicate field="Petal.Width" operator="lessThan" value="0.800585925579071"/>
      </Node>
      <Node id="4" score="0.0">
        <SimplePredicate field="Petal.Width" operator="greaterOrEqual" value="0.800585925579071"/>
      </Node>
    </Node>
    <Node id="5" defaultChild="7">
      <SimplePredicate field="Sepal.Length" operator="greaterOrEqual" value="5.450927734375"/>
      <Node id="6" score="1.0">
        <SimplePredicate field="Petal.Width" operator="lessThan" value="0.601367175579071"/>
      </Node>
      <Node id="7" score="0.0">
        <SimplePredicate field="Petal.Width" operator="greaterOrEqual" value="0.601367175579071"/>
      </Node>
    </Node>
  </Node>
</TreeModel>
[...]
However, when I compare these node IDs with the model info from H2O, the values deviate. Below is a short summary of the corresponding model:
    tree node      pred         feat       val dt.left_children dt.right_children
 1:    1    0 0.0000000 Sepal.Length 5.4509277                1                 2
 2:    1    1 0.8823530  Petal.Width 0.8005859                3                 4
 3:    1    2 0.1020408  Petal.Width 0.6013672                5                 6
 4:    1    3 1.0000000         <NA>        NA               -1                -1
 5:    1    4 0.0000000         <NA>        NA               -1                -1
 6:    1    5 1.0000000         <NA>        NA               -1                -1
 7:    1    6 0.0000000         <NA>        NA               -1                -1
 8:    2    0 0.0000000  Petal.Width 0.8005859                1                 2
 9:    2    1 1.0000000         <NA>        NA               -1                -1
10:    2    2 0.0000000         <NA>        NA               -1                -1
It appears that the H2O model info numbers the nodes in breadth-first (BFS) order starting at 0, while the PMML output numbers them in depth-first (DFS) order starting at 1.
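To make the target numbering concrete, here is a minimal Python sketch (not the XSLT I am looking for) of the renumbering I have in mind. It is only an illustration under several assumptions: `bfs_renumber` is a hypothetical helper name, the tree is an abbreviated copy of the snippet above with the predicates and scores omitted, the elements carry no XML namespace (a full PMML file normally declares one), and I assume the `defaultChild` references should be remapped together with the IDs.

```python
import xml.etree.ElementTree as ET
from collections import deque

# Abbreviated copy of the first tree above: only the Node elements with
# their id/defaultChild attributes are kept; no namespace is assumed.
PMML_TREE = """
<TreeModel>
  <Node id="1" defaultChild="2">
    <Node id="2" defaultChild="3">
      <Node id="3"/>
      <Node id="4"/>
    </Node>
    <Node id="5" defaultChild="7">
      <Node id="6"/>
      <Node id="7"/>
    </Node>
  </Node>
</TreeModel>
"""


def bfs_renumber(tree_model):
    """Rewrite Node ids in breadth-first order, starting at 0 (hypothetical helper)."""
    root = tree_model.find("Node")
    mapping = {}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # The next free BFS index is the number of nodes visited so far.
        mapping[node.get("id")] = str(len(mapping))
        queue.extend(node.findall("Node"))  # direct children, left to right
    # Second pass: apply the mapping to ids and to defaultChild references.
    for node in tree_model.iter("Node"):
        node.set("id", mapping[node.get("id")])
        if node.get("defaultChild") is not None:
            node.set("defaultChild", mapping[node.get("defaultChild")])
    return mapping


model = ET.fromstring(PMML_TREE)
mapping = bfs_renumber(model)
print(mapping)
```

For the first tree this maps the PMML IDs 1, 2, 5, 3, 4, 6, 7 to 0, 1, 2, 3, 4, 5, 6, which agrees with the `node` column of the table.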
QUESTION: How can we use XSLT to rewrite the node IDs in the PMML so that they match the table shown above?