opt.minsize {cubt}    R Documentation

Optimizes the minsize parameter by cross validation.

Description

Selects, by cross-validation, the value of the minsize parameter that minimizes the deviance of a CUBT tree with ncl clusters.

Usage

opt.minsize(datapp, ncl, dist = "im", nvc = 10, ms = NULL)

Arguments

datapp

Data set

ncl

desired number of clusters

dist

distance to use for pruning; may be "im" (mutual information), "hamming", or "euclidian".

criterion

criterion used for cubt, joining and prediction; default is "entropy", may be "anova"

prof

maximal depth of the initial tree; default is 7

nvc

number of cross validations

ms

A vector of minsize values to be tested. If NULL, a sequence is generated depending on the sample size.

Details

Optimizes the minsize parameter for CUBT by cross-validation when looking for ncl clusters. The criterion to optimize is the deviance of a cubt tree, that is, the sum of the deviances of its leaves. The deviance within a leaf is the sum, over variables, of the entropies (criterion "entropy") or of the sums of squares, ssq (criterion "anova").
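
The deviance described above can be illustrated with a minimal base R sketch. It is not the package's internal code: the helper names (entropy, ssq, leaf_deviance, tree_deviance) and the leaf_id vector are hypothetical, and it assumes unweighted Shannon entropy in nats, which may differ from the normalization cubt actually applies.

```r
# Shannon entropy (in nats) of one categorical variable.
# Assumption: no weighting by leaf size.
entropy <- function(x) {
  p <- table(x) / length(x)
  p <- p[p > 0]                      # drop empty levels so log() stays finite
  -sum(p * log(p))
}

# Sum of squared deviations from the mean of one numeric variable.
ssq <- function(x) sum((x - mean(x))^2)

# Deviance of one leaf: the per-variable criterion summed over all variables.
leaf_deviance <- function(leaf, criterion = c("entropy", "anova")) {
  criterion <- match.arg(criterion)
  f <- if (criterion == "entropy") entropy else ssq
  sum(vapply(leaf, f, numeric(1)))
}

# Deviance of a tree: the sum of its leaf deviances.
# `leaf_id` is a hypothetical vector assigning each row of `data` to a leaf.
tree_deviance <- function(data, leaf_id, criterion = "entropy") {
  leaves <- split(data, leaf_id)
  sum(vapply(leaves, leaf_deviance, numeric(1), criterion = criterion))
}

toy <- data.frame(a = c("x", "x", "y", "y"), b = c("u", "u", "u", "v"))
pure  <- tree_deviance(toy, leaf_id = c(1, 1, 2, 3))  # every leaf is homogeneous
mixed <- tree_deviance(toy, leaf_id = c(1, 1, 1, 1))  # one leaf holds all rows
```

In this toy data the partition into homogeneous leaves has a lower deviance than the single-leaf partition, which is the behaviour the criterion is meant to reward.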

Value

Returns a list containing the optimal value of minsize and the minimum deviance obtained by cross-validation.
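
The selection step behind this return value can be sketched generically in base R. This is an illustration of choosing a tuning value by K-fold cross-validation, not the implementation in cubt: cv_select, toy_score and the list element names minsize and deviance are assumptions, and the placeholder score stands in for fitting a tree on the training rows and computing its deviance on the held-out rows.

```r
# Generic selection of a tuning value by K-fold cross-validation.
# `score` is a hypothetical user-supplied function(train, test, m) returning
# the held-out deviance for candidate value m; it is not part of cubt.
cv_select <- function(n, ms, nvc, score) {
  fold <- sample(rep(seq_len(nvc), length.out = n))  # random fold labels
  cv_dev <- vapply(ms, function(m) {
    # average the held-out deviance over the nvc folds
    mean(vapply(seq_len(nvc), function(k) {
      score(which(fold != k), which(fold == k), m)
    }, numeric(1)))
  }, numeric(1))
  best <- which.min(cv_dev)
  # element names are assumed, chosen to mirror the description of the value
  list(minsize = ms[best], deviance = cv_dev[best])
}

# Placeholder score with a known minimum at m = 20, for illustration only.
toy_score <- function(train, test, m) (m - 20)^2 + length(test) / 100

set.seed(1)
res <- cv_select(n = 200, ms = seq(5, 50, by = 5), nvc = 10, score = toy_score)
```

Here res$minsize is the candidate with the smallest averaged held-out score, and res$deviance is that minimum.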

Author(s)

Badih Ghattas


[Package cubt version 3.3 Index]