Announcement: 5th Statistics Seminar of 2010

 

The Institute of Statistics will hold a statistics seminar as detailed below. We look forward to your participation.

 

Date and time: Wednesday, May 26, 2010, 5:00 PM

Venue: Room 501, Political Science and Economics Building (정경관), Korea University

Speaker: Myunghee Lee

(Department of Statistics, Colorado State University)

 

“Clustering High Dimensional, Low Sample Size Data using Maximal Data Piling”
We present a new hierarchical clustering method for high dimension, low sample size (HDLSS) data. The method exploits the fact that each individual data vector accounts for exactly one dimension in the subspace generated by HDLSS data. The linkage used to measure the distance between clusters is the orthogonal distance between the affine subspaces generated by each cluster. The ideal implementation would consider all possible binary splits of the data and choose the one that maximizes the distance between the two resulting groups. Because this is not computationally feasible in general, we approximate it using singular value decomposition. We provide theoretical justification for the method by studying high dimensional asymptotics. We also obtain the probability distribution of the distance measure under the null hypothesis of no split, which we use to propose a criterion for determining the number of clusters. Simulations and real data analyses with microarray data show that the proposed method achieves competitive clustering performance.
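The two ingredients named in the abstract, the orthogonal distance between affine subspaces and an SVD-based substitute for searching all binary splits, can be illustrated with a minimal NumPy sketch. This is not the speaker's implementation. The function names `affine_distance` and `svd_split` are hypothetical, and splitting by the sign of the first right singular vector is an assumption made here for illustration; the abstract does not say how the SVD is used. Data are assumed to be stored with one observation per column.

```python
import numpy as np


def affine_distance(X1, X2):
    """Orthogonal distance between the affine hulls of the columns of X1 and X2.

    X1 and X2 are (d x n1) and (d x n2) arrays, one observation per column.
    The distance is the norm of the part of (x_1 - y_1) that is orthogonal to
    all within-cluster difference vectors.
    """
    base = X1[:, 0] - X2[:, 0]
    # Direction vectors spanning the two affine hulls.
    D = np.hstack([X1[:, 1:] - X1[:, :1], X2[:, 1:] - X2[:, :1]])
    if D.shape[1] == 0:
        # Both clusters are single points.
        return float(np.linalg.norm(base))
    # Least-squares projection of base onto span(D); the residual is orthogonal.
    coef, *_ = np.linalg.lstsq(D, base, rcond=None)
    return float(np.linalg.norm(base - D @ coef))


def svd_split(X):
    """Heuristic binary split (an assumption, not the method from the talk).

    Centers the data and assigns each observation to a group according to the
    sign of its entry in the first right singular vector.
    """
    Xc = X - X.mean(axis=1, keepdims=True)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[0] >= 0
```

For a (d x n) array `X`, `labels = svd_split(X)` returns a boolean mask, and `affine_distance(X[:, labels], X[:, ~labels])` gives the linkage value for that split. When d is much larger than n and the points are in general position, the two affine hulls do not intersect, so this distance is strictly positive.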
Institute of Statistics, Korea University