Question

Description acceptable. The optimal choice for this quantity maximizes the Caliński–Harabasz (“kah-leen-skee hahr-ah-bash”) index. Chinese restaurant processes increment this quantity at each step with a probability that depends on the alpha parameter of the Dirichlet (“deer-ih-CLAY”) distribution. A good value for this quantity gives a high average silhouette score and evenly proportioned silhouette plots. Agglomerative approaches to building a hierarchy decrease this quantity at each step, as shown on dendrograms. This quantity can be selected by finding an “elbow” on plots of WCSS versus it. This quantity is selected as an input to, and lends its name to, an algorithm whose name ends in “plus plus,” which outperforms Lloyd’s algorithm and iteratively updates Voronoi cell centroids. For 10 points, name this quantity symbolized “k” in an unsupervised learning algorithm named for choosing “k means.”

ANSWER: number of clusters [accept number of partitions; accept number of tables after “Chinese” is read; accept number of centroids until “centroids” is read; accept number of means until “means” is read; accept descriptions of how many clusters or how many partitions; prompt on k until read] (WCSS is the within-cluster sum of squares. Lloyd’s algorithm is improved upon by k-means++.)
<Other Science>