Visualizing Bulk RNA-seq Data Using Phantasus

Bulk RNA-seq is a sequencing approach in which gene expression is averaged across a population of cells to measure which RNAs are present, and in what quantity, in a sample at the time of measurement. It is the preferred technique for transcriptomic investigation of tissue slices, biopsies, or pooled cell populations.

Once bulk RNA-seq data has been processed, the essential work remains: exploring, visualizing, and interpreting the biology. Without a visualization and analysis tool, this step can be time-consuming and laborious.

In this blog, we discuss Phantasus, a web application for visualizing and analyzing gene expression datasets, and how its integration with Polly simplifies bulk RNA-seq data analysis.

What Is Phantasus?

Phantasus is a user-friendly web application for interactive gene expression analysis. It simplifies data analysis by covering the whole workflow, from loading, normalizing, and filtering the data to performing differential gene expression and downstream analysis. To achieve this, Phantasus integrates an intuitive heatmap interface with gene expression analysis tools. The tool supports R-based methods such as k-means clustering, principal component analysis, and differential expression analysis with the limma package.

What Is a Heatmap?

A heatmap is a graphical representation of data that uses color coding to represent different values. For example, assume there are 20,000 genes listed in rows, with the conditions listed in columns. Each gene has a value under every condition. If we simply list the numerical value of each gene under every condition, it is difficult to see how the genes differ. So instead of numbers, we use colors. Each color represents a range of values, so when the heatmap is plotted, we get an idea of how genes behave under different conditions.
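
The idea that each color stands for a range of values can be sketched in a few lines of Python. This is only an illustration: the gene names, condition names, values, and bin boundaries below are made up, and a real heatmap would normally use a continuous color scale rather than four fixed bins.

```python
# Toy expression matrix: rows are genes, columns are conditions.
# Gene names, conditions, and values are made up for illustration.
expression = {
    "GeneA": [0.2, 5.1, 9.8],
    "GeneB": [7.5, 7.9, 0.4],
    "GeneC": [3.3, 3.0, 3.6],
}
conditions = ["control", "treated_6h", "treated_24h"]

# Each color covers a range of values: (upper bound, color label).
# The boundaries are arbitrary assumptions for this sketch.
COLOR_BINS = [(2.5, "blue"), (5.0, "white"), (7.5, "orange"), (float("inf"), "red")]


def value_to_color(value: float) -> str:
    """Return the label of the first bin whose upper bound is >= value."""
    for upper_bound, color in COLOR_BINS:
        if value <= upper_bound:
            return color
    raise ValueError(f"no bin for value {value!r}")


def to_color_grid(matrix: dict[str, list[float]]) -> dict[str, list[str]]:
    """Replace every numeric value in the matrix with its color label."""
    return {gene: [value_to_color(v) for v in values] for gene, values in matrix.items()}


color_grid = to_color_grid(expression)
print("gene", conditions)
for gene, colors in color_grid.items():
    print(gene, colors)
```

Reading the color grid row by row makes it easy to spot that GeneA rises across conditions while GeneC stays in the same range, which is much harder to see in the raw numbers.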

In the case of Phantasus, the heatmap represents gene expression data under different conditions and other listed parameters.

How Does Phantasus Help in Data Analysis?

In simple terms, Phantasus is an application that takes gene expression data in the GCT (Gene Cluster Text) file format as input and generates a heatmap of these genes with respect to conditions and parameters, such as cell type or cell line, that are listed in the metadata. Using this tool, we can easily analyze the gene data, distinguish between genes, and find the group of genes that matches our study interest. Statistical and differential expression methods are used to identify the differences between genes.
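
To give a feel for the input, here is a minimal sketch of reading a GCT file in plain Python. It assumes the simple GCT v1.2 layout (a version line, a dimensions line, a header line, then one tab-separated row per gene). The files used with Phantasus may be a later GCT version that also carries sample and gene annotations, which this sketch does not handle. The function name `read_gct` and the example values are hypothetical.

```python
import io

# A tiny made-up file in the GCT v1.2 layout: version line, dimensions line,
# header line, then one row per gene. Columns are tab-separated.
EXAMPLE_GCT = (
    "#1.2\n"
    "2\t3\n"
    "Name\tDescription\tsample_1\tsample_2\tsample_3\n"
    "TP53\ttumor protein p53\t10.5\t11.0\t3.2\n"
    "GAPDH\thousekeeping gene\t50.0\t49.5\t51.1\n"
)


def read_gct(handle):
    """Parse a GCT v1.2 stream into (sample_names, {gene_name: values})."""
    version = handle.readline().strip()
    if version != "#1.2":
        raise ValueError(f"unsupported GCT version line: {version!r}")
    n_rows, n_cols = (int(x) for x in handle.readline().split()[:2])
    header = handle.readline().rstrip("\n").split("\t")
    samples = header[2:]  # the first two columns are Name and Description
    if len(samples) != n_cols:
        raise ValueError("sample count does not match the dimensions line")
    rows = {}
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        fields = line.rstrip("\n").split("\t")
        rows[fields[0]] = [float(x) for x in fields[2:]]
    if len(rows) != n_rows:
        raise ValueError("row count does not match the dimensions line")
    return samples, rows


samples, matrix = read_gct(io.StringIO(EXAMPLE_GCT))
print(samples)
print(matrix["TP53"])
```

The result is exactly the genes-by-samples matrix that a heatmap draws: one row per gene, one column per sample.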

Key features of Phantasus include:

Loading public datasets from Gene Expression Omnibus, with both microarray and RNA-seq datasets supported.
Differential gene expression using limma or DESeq2.
Publication-ready plots with export to SVG: PCA plot, row profiles, box plots.
Clustering: k-means and hierarchical.
Gene set enrichment analysis.
Pathway enrichment analysis.
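
As a rough illustration of what a differential expression step computes, the sketch below ranks genes by a log2 fold change and a plain Welch t-statistic between two groups of samples. This is deliberately not limma or DESeq2, which use more sophisticated models, and all gene names and values are made up.

```python
import math
from statistics import mean, variance

# Made-up log2 expression values for two genes in two groups of samples.
log2_expr = {
    "GeneUp": {"control": [5.0, 5.2, 4.8], "treated": [8.1, 7.9, 8.0]},
    "GeneFlat": {"control": [6.0, 6.3, 5.9], "treated": [6.1, 6.0, 6.2]},
}


def welch_t(group_a, group_b):
    """Welch's t-statistic for two independent samples with unequal variances."""
    standard_error = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_b) - mean(group_a)) / standard_error


results = {}
for gene, groups in log2_expr.items():
    # On log2-scale data, the difference of group means is the log2 fold change.
    log2_fold_change = mean(groups["treated"]) - mean(groups["control"])
    results[gene] = (log2_fold_change, welch_t(groups["control"], groups["treated"]))

# Rank genes by the absolute t-statistic, strongest evidence first.
ranked = sorted(results, key=lambda g: abs(results[g][1]), reverse=True)
for gene in ranked:
    lfc, t_stat = results[gene]
    print(f"{gene}: log2FC={lfc:.2f}, t={t_stat:.2f}")
```

In this toy data, GeneUp ranks first because its groups are well separated relative to their spread, while GeneFlat barely changes.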

Phantasus on Polly

Polly, a data-centric ML Ops platform, hosts OmixAtlas, a data warehouse with millions of datasets from public, proprietary, and licensed sources. Phantasus can be used directly from Polly OmixAtlas. Because the datasets on Polly are highly curated, the Phantasus app integrates seamlessly and data can be analyzed readily without preprocessing. Any dataset on Polly can be opened in the application, and the corresponding heatmap is displayed.

The app loads data, normalizes it, and filters outliers to perform differential expression and other downstream analyses like plotting PCA plots or pathway analysis.

Visualizing a dataset on Polly-Phantasus

The figure above shows the visualization of a dataset in the form of a heatmap on Phantasus.

On the heatmap, the rows correspond to genes (or microarray probes) and are annotated with gene symbol and gene ID. Columns correspond to samples.
Phantasus on Polly reads the data_type field from the dataset metadata in the atlas and checks whether it has the value ‘RAW COUNTS TRANSCRIPTOMICS’. When it does, the app applies the VST normalization method before loading the visualizations. Normalization of large-scale expression data, including microarray and RNA-seq, aims to eliminate systematic experimental bias and technical variation while preserving biological variation.
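
The check described above amounts to a simple conditional on the metadata. The sketch below is hypothetical: the function names and the dictionary shape are assumptions, not the actual Polly or Phantasus code, and the text does not say what happens for other data types, so the sketch simply leaves such data as it is.

```python
# Value quoted in the text; everything else in this sketch is an assumption.
RAW_COUNTS_LABEL = "RAW COUNTS TRANSCRIPTOMICS"


def needs_vst(dataset_metadata: dict) -> bool:
    """Return True when the dataset metadata flags the data as raw counts."""
    return dataset_metadata.get("data_type") == RAW_COUNTS_LABEL


def choose_normalization(dataset_metadata: dict) -> str:
    """Name the normalization step a loader might run before plotting."""
    return "vst" if needs_vst(dataset_metadata) else "as_is"


print(choose_normalization({"data_type": "RAW COUNTS TRANSCRIPTOMICS"}))
print(choose_normalization({"data_type": "Transcriptomics"}))
```
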
A variance stabilizing transformation (VST) aims to generate a matrix of values whose variance is approximately constant across the range of mean values. This matters most for genes with low mean counts.
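
To see what "variance stabilizing" means, consider a toy case that is much simpler than the VST used for real RNA-seq data: counts that follow a Poisson distribution, where the variance equals the mean. The sketch below computes the variance of the raw counts and of the square-root-transformed counts 2*sqrt(x) directly from the Poisson probabilities. The Poisson model and the square-root transform are assumptions chosen for illustration only; RNA-seq tools typically model extra dispersion and use a more elaborate transformation.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability P(X = k), computed in log space for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def variance_of(transform, lam: float) -> float:
    """Variance of transform(X) for X ~ Poisson(lam), by summing over the pmf."""
    upper = int(lam + 20 * math.sqrt(lam) + 50)  # the tail beyond this is negligible
    probs = [poisson_pmf(k, lam) for k in range(upper + 1)]
    first_moment = sum(p * transform(k) for k, p in enumerate(probs))
    second_moment = sum(p * transform(k) ** 2 for k, p in enumerate(probs))
    return second_moment - first_moment ** 2


def identity(x):
    """Leave the count unchanged."""
    return x


def sqrt_transform(x):
    """Classic variance stabilizing transform for Poisson counts."""
    return 2 * math.sqrt(x)


for lam in (5, 20, 100):
    raw = variance_of(identity, lam)
    stabilized = variance_of(sqrt_transform, lam)
    print(f"mean={lam:>3}  raw variance={raw:7.2f}  transformed variance={stabilized:.2f}")
```

The raw variance grows with the mean, while the transformed variance stays close to 1 once the mean is moderately large. After such a transformation, highly expressed genes no longer dominate a heatmap or PCA plot simply because their counts are larger.
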
The Phantasus application uses the normalized data to draw all visualizations. It has been integrated on Polly and can be used directly on OmixAtlases such as the GEO Raw Counts OmixAtlas and the Bulk RNASeq OmixAtlas.

Polly hosts the world’s largest collection of highly curated, ML-ready bulk and single-cell RNA-seq data. Our curation pipelines, high-quality, accurately annotated data, standard workflows, and scientific expertise are used by industry and academia across the globe to accelerate drug discovery. Reach out to us to learn more about how to accelerate your research!

info@elucidata.io
