Commit 133fdbf

small cleanups

1 parent 41f4297 commit 133fdbf
File tree

1 file changed: +3 −4 lines changed

mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala

Lines changed: 3 additions & 4 deletions
@@ -51,7 +51,7 @@ import org.apache.spark.util.random.{SamplingUtils, XORShiftRandom}
  * findSplits() method during initialization, after which each continuous feature becomes
  * an ordered discretized feature with at most maxBins possible values.
  *
- * The main loop in the algorithm operates on a queue of nodes (nodeQueue). These nodes
+ * The main loop in the algorithm operates on a queue of nodes (nodeStack). These nodes
  * lie at the periphery of the tree being trained. If multiple trees are being trained at once,
  * then this queue contains nodes from all of them. Each iteration works roughly as follows:
  *   On the master node:
@@ -162,11 +162,10 @@ private[spark] object RandomForest extends Logging {
     }

     /*
-      FILO queue of nodes to train: (treeIndex, node)
-      We make this FILO by always inserting nodes by appending (+=) and removing with dropRight.
+      Stack of nodes to train: (treeIndex, node)
       The reason this is FILO is that we train many trees at once, but we want to focus on
       completing trees, rather than training all simultaneously. If we are splitting nodes from
-      1 tree, then the new nodes to split will be put at the end of this list, so we will continue
+      1 tree, then the new nodes to split will be put at the top of this stack, so we will continue
       training the same tree in the next iteration. This focus allows us to send fewer trees to
       workers on each iteration; see topNodesForGroup below.
     */
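The rationale in the updated comment — last-in-first-out processing means a split tree's new child nodes are trained next, so the loop finishes one tree before moving on — can be sketched in Scala. This is a hypothetical illustration, not the actual RandomForest code; the collection contents and names are invented:

```scala
import scala.collection.mutable

// Hypothetical sketch: why LIFO node processing keeps training focused
// on one tree. Each entry is (treeIndex, nodeLabel); labels are invented.
object NodeOrderSketch {
  def main(args: Array[String]): Unit = {
    // Old style: a ListBuffer used as a FILO queue by appending with +=
    // and removing from the end.
    val queue = mutable.ListBuffer((0, "root"), (1, "root"))
    queue += ((1, "leftChild")) // splitting tree 1 appends its child
    val next = queue.remove(queue.length - 1)
    assert(next == (1, "leftChild")) // tree 1's new node is trained next

    // New style: an explicit Stack makes the LIFO intent self-documenting.
    val stack = mutable.Stack((0, "root"), (1, "root"))
    stack.push((1, "leftChild"))
    assert(stack.pop() == (1, "leftChild"))
  }
}
```

A FIFO queue would instead round-robin across all trees, forcing every tree's partially trained nodes to be shipped to workers in each iteration.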
