summarize-biomedical-papers-long-summary-or-tldr / examples /Mining the bladder cancer-associated genes by an integrated strategy for the construction and analysis of differential co-expression networks.txt
Blaise-g's picture
Update examples/Mining the bladder cancer-associated genes by an integrated strategy for the construction and analysis of differential co-expression networks.txt
e899a79
the morbidity of bladder cancer is in the first place among the cancers of urinary system. the bladder cancer cells can spread by breaking away from the original tumor. they can spread through the blood vessels to the liver, lungs and bones. however, its causes are not yet clear. the bladder cancer is a heterogeneous disease that shows both superficial and invasive growth. superficial tumors frequently recur and may progress to invasive growth. a part that warrants better treatment regimes bladder cancer is also a good model system to study tumor initiation and progression. to gain insights into the molecular biology of these processes, we performed gene expression analyses to get some important information about bladder cancer-associated genes. systems biology is an emerging approach applied to biomedical and biological scientific research. it is a biology-based inter-disciplinary field of study that focuses on complex interactions within biological systems, using a holistic approach to biological and biomedical research. network biology is a new way of representation and analysis of biological information processing, which understands life as a network. in fact, the network biology is a branch of the systems biology. differential co-expression network is one of biological networks. a gene co-expression network has emerged as a novel holistic approach for microarray analysis. stuart et al. and bergmannet al. separately constructed the gene co-expression network that connected genes whose expression profiles were similar across different organisms. a human network was analyzed by leeet al. with functional grouping and cluster analysis. van noort et al. demonstrated the small-world and scale-free architecture of the yeast co-expression network. they showed that functionally related genes are frequently co-expressed across organisms constituting conserved transcription modules. we wanted to explore transcriptional changes in terms of gene interactions rather than at the level of individual genes. in the end, we constructed two gene co-expression networks and sought to find cancer-induced changes in the network. the identification of co-expressed pairs in tumor and normal tissues led to the construction of two distinct networks that represent tumor and normal states, respectively. we expected that biological changes would be reflected in transcriptional changes, which could be identified by comparing the two co-expression networks. in the transcriptome analysis, differential co-expression analysis is emerging as a unique complement to traditional differential expression analysis. dcea investigates differences in gene interconnection by calculating the expression correlation changes of gene pairs between two conditions. the rationale behind differential co-expression analysis is that changes in gene co-expression patterns between two contrasting phenotypes provide hints regarding the disrupted regulatory relationships or affected regulatory subnetworks specific to the phenotype of interest. therefore, among the many growing directions of dcea, there is the so-called "differential regulation analysis", which integrates the transcription factor -to-target information to probe upstream regulatory events that account for the observed co-expression changes. recently, many researchers have integrated differential co-expression and differential expression concepts to propose a novel regulatory impact factor that can be used to prioritize disease-causative tfs. in addition, a lot of researchers have begun to perform differential co-expression analyses of micrornas. currently, some tools have been developed for differential expression analysis based on microarray, such as r packages "limma", "samr", "wgcna" and so on. in our study, we collected microarray datasets of bladder cancer from geo http://www.ncbi.nlm.nih.gov/geo/ to analyze the datasets by an integrated strategy including some functions of samr, wgcna, cytoscape and other packages. we selected some degs at two different state and constructed two dcns. through the comparisons between two dcns of two different states, we found some hub genes associated with the bladder cancer. the datasets of gene expression affymetrix microarray of bladder cancer were download from the geo database of ncbi. it has samples, are from the normal tissues and the other samples are from the cancer tissues. the datasets were processed by an integrated strategy. some degs were selected and constructed two dcns. in the end, simple analysis was applied to the two dcns and some hub genes were found through comparing two dcns at different states. normalization of the microarray datasets in order to get high-quality and strong-expression genes for the convenience of the following data processing, we normalized the microarray dataset using medians. after normalization, the expression values are in the better order. we also discovered the distribution of the expression datasets. selection and bioinformatical analysis of degs after preprocessing the microarray datasets including the above normalization, some degs were selected using the r package "samr" = 100; del = ). up-regulated genes and down-regulated genes were picked out. next, we did bioinformatics analysis of those degs including go function enrichment using a online tool amigo. a part of go enrichment results are showed in figure from the go function enrichment results, we can easily find that the main functions of the up-regulated degs are nitrogen compound metabolic process, heterocycle metabolic process, cellular aromatic compound metabolic process and regulation of metabolic process. and the down-regulated degs mostly involve ion binding,multicellular organismal process, single-multicellular organisms process and response to stimulus etc. we used a online tool gather to do pathway enrichment. the up-regulated degs mostly involve the pathways related to "metabolism". however, the down-regulated degs are included in toll-like receptor signaling pathway, gamma-hexachlorocyclohexane degradation besides two metabolism-associated pathways. we also investigated the clustering of the degs associated with the bladder cancer. the heatmap of the degs is showed in figure from the heatmap, we can see the clustering results of the degs. the clustering method can correctly divide the samples into two classes. construction and analysis of two dcns we first calculated a adjacency matrix of degs at the normal or cancer state using the method "pearson correlation" based on the gene expression values. if the adjacency value of a gene pair is greater than, the two genes will be connected to be an edge of a dcn. we constructed a normal dcn and a cancer dcn employing the package "wgcna". we can easily see that in the cancer state, a few up-regulated degs take parts in the co-expression relations. through the comparisons of two different dcns, we found that the shortest path length distribution has a little changes from the normal to the cancer. however, the average clustering coefficient distribution and the topological coefficients under the normal condition differ from those under the cancer conditions, which indicates that the modules in the two different dcns have a lot of changes. next, we want to find differential links between two different dcns. at first, we set two thresholds t and t if the correlation of a gene pair is less than t at normal state, but is bigger than t at the cancer state, the link of the gene pair is defined as a differential link. we computed the significance of the differential links using permuation test. all the selected differential links can compose a differential co-expression subnetwork. in figure the blue ellipses represent down-regulated degs and the red ellipses represent up-regulated degs. the shapes of nodes grow bigger with the degree of nodes. the biggest nodes represent the hub genes: "gdf9", "cyp1a2", "atf7", "trpm3", "cer1", "ptprj", "kcnip1", and "lrrc15". these hub genes mainly involve the biological processes: cellular response to stimulus, regulation of biological process, response to stimulus, multi-organism process, and regulation of localization. and they mostly take part in five pathways: gamma-hexachlorocyclohexane degradation, fatty acid metabolism), adherens junction, tryptophan metabolism, and wnt signaling pathway. among of them, "cyp1a2" and "ptprj" have been reported that they are associated with the bladder cancer. and then, we dectected the modules of a dcn. there have been some methods using clustering algorithms. here, we adopted another good approach. in order to detect the modules of two dcns under different conditions, we begun with calculating the topological overlap matrix of expression datasets. the topological overlap of two nodes reflects their similarity in terms of the commonality of the nodes they connect to. in order test and verify the inference, we clustered the tom similarities from two different conditions. from the additonal file it is obvious that the modules at two different states change greatly. for observing the module changes, we plotted the module heatmaps of degs at two different states and showed parts of the module heatmaps. discussion the development of molecular markers for tumor classification and expression signatures that predict outcome will greatly improve diagnosis treatment of bladder cancer. we employed the r package "samr" to select degs including up-regulated and down-regulated. in the go function enrichment results, it is showed that the main functions of the degs are cellular physiological precess and cellular metabolism, which is reasonable for the bladder. however, it is not clear why the difference between the number of the up-regulated and the down-regulated is so large. we used the r packages "wgcna" to construct two dcns under different conditions. and we used the tool "cytoscape" to visualize and analyze the two different dcns. some hub genes were found and analyzed in view of bioinformatics. the hub genes of the normal dcn mainly involve the neuroactive ligand-receptor interaction pathway and their go functions mostly are response to biotic stimulus,response to stimulus and response to external stimulus. the hub genes of the cancer dcn involve the following three kegg pathways: gamma-hexachlorocyclohexane degradation, fatty acid metabolism and tryptophan metabolism. their main go functions are cellular physiological process, surface receptor linked signal transport and signal transduction. in addition, we found some difference of the two dcns in modules from the clustering plots and the heatmaps. but it is unknown for us that the functions of the different modules, which is our future work. we found several hub genes from the selected differential co-expression subnetwork of two different dcns. two of them have been reported to be associated with the bladder cancer. then whether are the other hub genes associated with the bladder cancer? it need to be validated through biological experiments. in the work, we adopted an integrated strategy to analyzing the bladder cancer-associated genes by combining several r packages, gene ontology and kegg. in the experimental results, it shows that the bladder cancer results from the abnormal signaling pathways caused by many genes. through the data mining for gene expression microarrays, we found differential co-expression subnetwork and the hub genes of the subnetwork. through the main go functions and pathways of the hub genes, we can better understand the development of the bladder cancer, which will support the wet biological experiment and even further promote the prevention, treatment, diagnosis and cure of the bladder cancer in the future.