
Outline: microarray data analysis


Academic year: 2021


(1)

Clustering

(2)

Outline: microarray data analysis

Gene expression Microarrays

Preprocessing

-- normalization

-- scatter plots

Inferential statistics

-- t-test

-- ANOVA

Exploratory (descriptive) statistics

-- distances

-- clustering

-- principal components analysis (PCA)

(3)

Descriptive statistics

Microarray data are high-dimensional: many thousands of measurements are made from a small number of samples.

Descriptive (exploratory) statistics help you to find meaningful patterns in the data.

A first step is to arrange the data in a matrix.

Next, use a distance metric to define the relatedness of the different data points. Two commonly used distance metrics are:

-- Euclidean distance

-- Pearson coefficient of correlation
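The two metrics can be sketched in a few lines of Python. This is only an illustration: the gene names and expression values are made up, and the matrix layout (rows as genes, columns as samples) is an assumption. The Pearson coefficient is a similarity rather than a distance, so a common convention is to use 1 - r when a distance is required.

```python
import math

def euclidean_distance(x, y):
    """Straight-line distance between two equal-length expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson_correlation(x, y):
    """Pearson correlation coefficient r, ranging from -1 to 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data matrix: each row is a gene, each column a sample.
expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],  # same trend as gene_a, twice the magnitude
    "gene_c": [4.0, 3.0, 2.0, 1.0],  # opposite trend to gene_a
}

d_ab = euclidean_distance(expression["gene_a"], expression["gene_b"])
r_ab = pearson_correlation(expression["gene_a"], expression["gene_b"])
r_ac = pearson_correlation(expression["gene_a"], expression["gene_c"])

print(f"Euclidean distance a-b: {d_ab:.3f}")
print(f"Pearson r a-b: {r_ab:.3f}, as a distance: {1 - r_ab:.3f}")
print(f"Pearson r a-c: {r_ac:.3f}, as a distance: {1 - r_ac:.3f}")
```

In this toy matrix gene_a and gene_b are far apart by Euclidean distance but perfectly correlated, which shows why the choice of metric matters: Euclidean distance is sensitive to magnitude, whereas correlation compares only the shape of the profiles.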


(4)

What is a cluster?

A cluster is a group that has homogeneity (internal cohesion) and separation (external isolation). The relationships between objects being studied are assessed by similarity or dissimilarity measures.
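One rough way to make homogeneity and separation concrete is to compare the average distance within a group to the average distance between groups. The sketch below assumes Euclidean distance as the dissimilarity measure and uses invented two-dimensional points; the helper names are hypothetical.

```python
import math
from itertools import combinations, product

def mean_within(cluster):
    """Average pairwise distance inside one cluster (lower = more homogeneous)."""
    pairs = list(combinations(cluster, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def mean_between(cluster_a, cluster_b):
    """Average distance across two clusters (higher = better separated)."""
    pairs = list(product(cluster_a, cluster_b))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

# Hypothetical objects, each described by two measurements.
cluster_1 = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
cluster_2 = [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]

within_1 = mean_within(cluster_1)
within_2 = mean_within(cluster_2)
between = mean_between(cluster_1, cluster_2)

print(f"within cluster 1: {within_1:.3f}")
print(f"within cluster 2: {within_2:.3f}")
print(f"between clusters: {between:.3f}")
```

When the within-cluster averages are much smaller than the between-cluster average, as here, the grouping shows both internal cohesion and external isolation.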

(5)

Background

• Clustering is one of the most important unsupervised learning processes: it organizes objects into groups whose members are similar in some way.

• Clustering finds structure in a collection of unlabeled data.

• A cluster is a collection of objects that are similar to one another and dissimilar to the objects belonging to other clusters.

(6)
(7)

Motivation I

• Microarray data quality checking

– Do replicates cluster together?

– Do similar conditions, time points, and tissue types cluster together?
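A simple version of the replicate check can be sketched as follows: for every sample, find the other sample it correlates with most strongly and see whether that is its replicate. The sample names and values are invented for illustration, and using Pearson correlation as the similarity measure is an assumption, not something the slides prescribe.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical expression values for five genes in four arrays.
samples = {
    "control_rep1": [2.1, 5.0, 7.9, 1.2, 3.3],
    "control_rep2": [2.0, 5.2, 8.1, 1.0, 3.5],
    "treated_rep1": [6.8, 1.1, 2.9, 7.7, 3.0],
    "treated_rep2": [7.0, 0.9, 3.1, 8.0, 3.2],
}

def nearest_neighbour(name):
    """Name of the other sample most correlated with the given sample."""
    others = [other for other in samples if other != name]
    return max(others, key=lambda other: pearson(samples[name], samples[other]))

nearest = {name: nearest_neighbour(name) for name in samples}

for name, neighbour in nearest.items():
    print(f"{name} is closest to {neighbour}")
```

If a replicate's nearest neighbour turned out to be a sample from a different condition, that array would be a candidate for closer inspection.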

(8)

Motivation II

• Cluster genes → Prediction of functions of unknown genes by known ones

(9)

Functionally significant gene clusters

Two-way clustering

Gene clusters

Sample clusters

(10)

Motivation II

• Cluster genes → Prediction of functions of unknown genes by known ones

• Cluster samples → Discover clinical characteristics (e.g. survival, marker status) shared by samples.

(11)

Bhattacharjee et al. (2001) Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc. Natl. Acad. Sci. USA, Vol. 98, 13790-13795.

(18)

Hierarchical Clustering

• Calculate the similarity between all possible combinations of two profiles.

• Group the two most similar clusters together to form a new cluster.

• Calculate the similarity between the new cluster and all remaining clusters.

Keys

• Similarity

• Clustering
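The procedure can be sketched as a short agglomerative loop. Several choices here are assumptions rather than anything stated on the slide: similarity is expressed as Euclidean distance (smaller means more similar), the distance between two clusters is the average over all cross-cluster pairs (average linkage), and the steps repeat until a single cluster remains. The profiles are invented.

```python
import math
from itertools import combinations

def average_linkage(cluster_a, cluster_b, profiles):
    """Mean Euclidean distance over all cross-cluster pairs of profiles."""
    total = sum(
        math.dist(profiles[i], profiles[j]) for i in cluster_a for j in cluster_b
    )
    return total / (len(cluster_a) * len(cluster_b))

def hierarchical_clustering(profiles):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    # Start with every profile in its own cluster, identified by its index.
    clusters = [(i,) for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        # Compare all pairs of current clusters and pick the closest pair.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        distance = average_linkage(a, b, profiles)
        # Replace the two closest clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, distance))
    return merges

# Hypothetical profiles, each with two measurements.
profiles = [(1.0, 1.0), (1.5, 1.0), (5.0, 5.0), (5.0, 5.6), (9.0, 1.0)]
merges = hierarchical_clustering(profiles)

for a, b, distance in merges:
    print(f"merge {a} + {b} at distance {distance:.3f}")
```

The merge history is what a dendrogram draws: early merges at small distances join the most similar profiles, and later merges at larger distances join whole groups. Other linkage rules (single, complete) only change `average_linkage`.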

