Comparing gene signatures against available datasets has proven to be a powerful technique, used by biopharma R&D teams for drug discovery, biomarker identification, development, and personalized medicine.
This technique allows researchers to analyze the expression levels of large numbers of genes in samples from individuals with a particular condition or disease and to narrow them down to a conserved cluster of genes whose expression levels are most strongly associated with that condition.
This gene signature can then be used to search public databases of gene expression data for other drugs or compounds that can revert the disease signature, indicating a potential therapeutic effect.
However, extracting associated signatures from public databases can be challenging because the sources use different processing pipelines, syntaxes, schemas, and metadata annotations. We address these challenges through Polly’s RNA-Seq OmixAtlas.
This blog discusses how users can compare signatures using Polly's RNA-Seq OmixAtlas.
Polly is a biomedical data platform for life sciences R&D, primarily delivering bulk & single-cell RNA-seq data, along with 24 other data types. It delivers 155 TB of FAIR and ML-ready biomedical data from ~30 different public and proprietary sources to customers. Polly’s RNA-Seq OmixAtlas (OA) contains curated RNA-seq datasets collected from Gene Expression Omnibus (GEO). This richly curated resource provides a good base for researchers looking for datasets with transcriptional profiles similar to their gene sets of interest.
All datasets on Polly are:
The first step in comparing gene signatures is to create a query in which the gene of interest is searched against a dataset to identify a closely associated gene cluster. To generate a query signature, the following steps are required:
Alternatively, Polly experts can work with your scientists to customize these steps as needed, capture transcriptome profiles, and generate queries (gene signature vectors) that run on Polly’s signature database. The query consists of the gene clusters that were significantly differentially expressed in the experiment, along with their log fold change, p-values, and adjusted p-values.
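To make the shape of such a query concrete, here is a minimal sketch of turning differential expression results into a gene signature vector. It is an illustration only, not Polly's pipeline: the `DEResult` record, the `build_query_signature` helper, the gene names, the values, and both thresholds (adjusted p-value of at most 0.05, absolute log fold change of at least 1) are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class DEResult:
    """One row of a differential expression table (hypothetical layout)."""
    gene: str
    log_fc: float
    p_value: float
    adj_p_value: float


def build_query_signature(results, max_adj_p=0.05, min_abs_log_fc=1.0):
    """Keep significantly changed genes and return {gene: log fold change}.

    Both thresholds are illustrative defaults, not values taken from Polly.
    """
    return {
        r.gene: r.log_fc
        for r in results
        if r.adj_p_value <= max_adj_p and abs(r.log_fc) >= min_abs_log_fc
    }


# Made-up differential expression results, for illustration only.
de_results = [
    DEResult("GENE_A", 2.4, 0.0001, 0.002),   # kept: significant, upregulated
    DEResult("GENE_B", -1.8, 0.0004, 0.006),  # kept: significant, downregulated
    DEResult("GENE_C", 0.3, 0.2, 0.4),        # dropped: small, non-significant change
    DEResult("GENE_D", -2.1, 0.03, 0.12),     # dropped: fails adjusted p-value cutoff
]

query_signature = build_query_signature(de_results)
print(query_signature)
```

The resulting dictionary keeps the sign of each log fold change, which is what later makes it possible to tell a matching profile from a reversed one.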
Example of a query: Given an input gene set and its log fold change values, search for the datasets that show the highest cosine similarity scores against the input genes and their differential expression results.
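The example query can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Polly's implementation: signatures are represented as `{gene: log fold change}` dictionaries, similarity is computed only over genes present in both signatures, and the dataset names, gene names, and values are made up.

```python
import math


def cosine_similarity(query, dataset):
    """Cosine similarity over the genes shared by two {gene: log_fc} signatures."""
    shared = set(query) & set(dataset)
    dot = sum(query[g] * dataset[g] for g in shared)
    norm_q = math.sqrt(sum(query[g] ** 2 for g in shared))
    norm_d = math.sqrt(sum(dataset[g] ** 2 for g in shared))
    if norm_q == 0 or norm_d == 0:
        return 0.0  # no shared genes, or an all-zero vector
    return dot / (norm_q * norm_d)


def rank_datasets(query, signatures):
    """Return (dataset name, score) pairs, most similar first."""
    scores = {name: cosine_similarity(query, sig) for name, sig in signatures.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical query and dataset signatures, for illustration only.
query = {"GENE_A": 2.0, "GENE_B": -1.0, "GENE_C": 0.5}
signatures = {
    "dataset_1": {"GENE_A": 4.0, "GENE_B": -2.0, "GENE_C": 1.0},   # same direction
    "dataset_2": {"GENE_A": -2.0, "GENE_B": 1.0, "GENE_C": -0.5},  # opposite direction
    "dataset_3": {"GENE_A": 1.0, "GENE_B": 2.0, "GENE_C": 0.0},    # unrelated pattern
}

ranked = rank_datasets(query, signatures)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

Because cosine similarity depends on direction rather than magnitude, `dataset_1` scores close to 1 even though its fold changes are twice as large as the query's, while the mirrored `dataset_2` scores close to -1.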
This signature database can now be queried to identify datasets with similar transcriptional profiles to the Query Signature. For instance, users can run complex SQL queries to identify:
We used signature reversal and multivariate gene expression signatures to identify potential drug combinations for COVID-19. To do this, publicly available transcriptomics data from COVID-19 studies and drug signatures from LINCS were compiled, processed, and curated. All datasets were ingested through Polly's proprietary curation pipeline, enriched with ontology-backed metadata, and engineered to a query-able .gct format.
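The core idea of signature reversal can be sketched as follows. This is a deliberately simple illustration, not the method used in the COVID-19 study: it scores single compounds rather than combinations, uses negative cosine similarity as the reversal score (an assumption; other connectivity scores exist), and all names and values are hypothetical.

```python
import math


def reversal_score(disease, drug):
    """Negative cosine similarity over shared genes.

    A score near +1 means the drug signature moves genes in the opposite
    direction to the disease signature; a score near -1 means it mimics it.
    """
    shared = set(disease) & set(drug)
    dot = sum(disease[g] * drug[g] for g in shared)
    norm_disease = math.sqrt(sum(disease[g] ** 2 for g in shared))
    norm_drug = math.sqrt(sum(drug[g] ** 2 for g in shared))
    if norm_disease == 0 or norm_drug == 0:
        return 0.0  # no shared genes, or an all-zero vector
    return -dot / (norm_disease * norm_drug)


# Hypothetical disease and drug signatures ({gene: log fold change}).
disease_signature = {"GENE_A": 2.0, "GENE_B": 1.5, "GENE_C": -1.0}
drug_signatures = {
    "compound_x": {"GENE_A": -1.8, "GENE_B": -1.2, "GENE_C": 0.9},  # opposes the disease pattern
    "compound_y": {"GENE_A": 1.0, "GENE_B": 0.5, "GENE_C": -0.4},   # mimics the disease pattern
}

# Rank compounds so that the strongest reversers come first.
candidates = sorted(
    drug_signatures,
    key=lambda name: reversal_score(disease_signature, drug_signatures[name]),
    reverse=True,
)
print(candidates)
```

A compound that ranks first here is only a candidate for follow-up; a high reversal score indicates a potential therapeutic effect, not a confirmed one.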
Want to perform gene signature comparisons effectively? Talk to us!
Polly provides access to a curated repository of RNA-seq datasets that are consistently processed and enriched with metadata. This harmonization allows researchers to efficiently search for datasets with similar transcriptional profiles, facilitating transcriptome profiling and biomarker identification.
Polly utilizes signature reversal and multivariate gene expression signatures to predict potential drug combinations. By analyzing publicly available transcriptomics data and drug signatures, Polly can identify drugs or compounds that may have therapeutic effects by reversing disease signatures.
Polly ranks similar datasets using cosine similarity scores, which measure how closely a dataset's transcriptional profile matches the query signature. This helps researchers quickly find relevant datasets for further analysis and validation.
Researchers define the biological process of interest, select a dataset, preprocess the data, identify differentially expressed genes, and validate the signature. Polly’s platform streamlines this process with expert support and ML-ready datasets.
Polly's RNA-Seq OmixAtlas addresses challenges in extracting associated signatures from public databases by providing a curated resource of RNA-seq datasets collected from the Gene Expression Omnibus (GEO). This richly curated resource helps researchers find datasets with transcriptional profiles similar to their gene sets of interest.
Gene signature comparison analyzes gene expression patterns to identify disease-related signatures. It helps researchers find drugs that can reverse disease signatures, aiding in therapeutic discoveries.