
The OS for Programmable Science
Understanding the core principles and philosophy that drive Tracer's mission to accelerate scientific discovery through better observability.

3 Ways to Slash Bioinformatics Cloud Costs
Current cloud cost monitoring tools fall short for bioinformatics, leading to a clear and growing demand for better solutions.
Why is alert investigation for pipelines still so difficult in the age of AI?
Despite advances in AI and machine learning, investigating pipeline alerts remains a frustrating, manual process. Here's why the problem persists and what it would take to fix it.
Converting BAM to BED: A Complete Guide
Learn how to convert BAM files to BED format using bedtools, samtools, and other bioinformatics tools with practical examples.
Merging eggNOG and InterProScan: Best Practices for Functional Annotation
A guide to merging conflicting functional annotations from eggNOG and InterProScan using a Python workflow that preserves confidence flags and avoids common interpretation mistakes.
A Comparative Review of Bioinformatics Pipeline Frameworks
A comprehensive comparison of bioinformatics workflow frameworks (Nextflow, Snakemake, CWL, WDL, Flyte, Prefect, Airflow) with practical guidance on matching tools to team needs.
Right-Sizing with Tracer
A practical guide to reducing AWS costs for bioinformatics pipelines by 40-80% through data-driven right-sizing, with step-by-step implementation using tools like Nextflow, STAR, and GATK.
Diagnose Spark Performance Issues Using Spark UI and Query Plans
Spark jobs running slow but you don't know why? This guide teaches you how to map Spark UI metrics to physical query plans, so you can explain exactly what's causing bottlenecks and validate that your fixes actually work.
Scalable FASTQ QC: Merging, Automation and MultiQC Reporting
Learn how to merge hundreds of FASTQ files, automate FastQC execution in parallel, and aggregate all results into a single interactive MultiQC report.
Superfluid breaks the STAR bottleneck with Tracer and NVIDIA Parabricks
Superfluid Dx partnered with Tracer to benchmark CPU vs GPU-accelerated STAR alignment for Alzheimer's research. Results show 67% faster runtime at near cost parity with NVIDIA L40S GPUs.
CPU: 61% lower pipeline cost. The real bottleneck was never compute.
How execution-level data unlocked 61% more science per dollar without changing hardware accelerators.
AI for Data Pipelines: The False Alerts Fatigue Problem
You know the drill: your phone buzzes at 2am. Heart racing, you grab it, only to discover it's another false alarm. The pipeline's fine. You're wide awake for nothing.
The Silent Killer in Your Pipeline: Why You Keep Forgetting the Reverse Strand
A deep dive into one of the most common yet overlooked mistakes in bioinformatics pipelines: failing to properly handle reverse strand sequences. Learn why this happens and how to catch it before it ruins your analysis.
Running NVIDIA Parabricks FQ2BAM on AWS with Nextflow (GPU Instances)
A comprehensive, hands-on tutorial covering setup, execution, validation, benchmarking, and best practices for GPU-accelerated FASTQ-to-BAM processing in the cloud.
Monitoring Airflow with Prometheus, StatsD, and Grafana
How to get real-time insights into Airflow using Prometheus, StatsD, and Grafana.
Bioinformatics Pipeline Frameworks (2025): Nextflow vs Flyte vs Airflow vs Snakemake
Choosing the wrong pipeline framework can break reproducibility and double compute costs. We compare Nextflow, Flyte, Prefect, Airflow, and Slurm using real production bioinformatics workloads—what scales, what fails, and when to use each.
How to Properly Filter Missing Genotypes in bcftools
A practical guide to filtering missing genotypes in VCF files using bcftools, with examples and best practices for handling missing data in variant calling.
Interview with Mathieu Latreille: Bridging Biology and Computation
Mathieu Latreille, biomedical researcher and TechBio entrepreneur, discusses the bottlenecks in modern research: fractured datasets, inconsistent metadata, and the cultural challenges that make collaboration between biologists and computational scientists unnecessarily difficult.
A Practical Guide to Scalable Bioinformatics Workflows on AWS Batch
A comprehensive guide to AWS Batch architecture, its core components, and how it enables scalable batch processing for compute-intensive workloads.
The Most Common Bioinformatics Mistakes and How to Avoid Them
Learn about the most frequent errors in bioinformatics workflows and practical strategies to prevent them.
The Silent Gene Name Corruption Everyone Misses
A Simple and Practical Guide to Stop Excel From Breaking Your Biology
How to Handle Multi-Mapped Reads in Ribo-seq
A comprehensive guide on handling multi-mapped reads in Ribo-seq data analysis.
Differential Gene Expression Analysis: RNA-Seq with Pathway Enrichment
A comprehensive tutorial on RNA-seq differential expression analysis using DESeq2 and pathway enrichment with clusterProfiler. Learn to identify genes and pathways responding to ALDH1A1 inhibition in ovarian cancer cells.
Visibility Is All You Need
An idea for a new monitoring system for high-compute workloads, designed to improve productivity and efficiency in scientific computing.
Uncovering Hidden Cloud Costs
Discover the challenges of monitoring and controlling cloud spend in scientific computing, and what teams can do about it.
Cloud cost monitoring in bioinformatics
Monitoring and understanding costs for scientific workloads running on cloud infrastructure such as AWS remains a persistent challenge that current tools fail to solve.
Hackathon Day Four: Thursday, June 5th
Tracer's beta launch is nearly here. Today we shipped onboarding, fallback tracing, branch-based installs, and better OOM insights.
Hackathon Day Three: Wednesday, June 4th
Kenya Day Three: A well-earned break, a tour through Nairobi's rich history, and rooftop views before diving back into build mode.
Hackathon Day Two: Tuesday, June 3rd
Kenya Day Two: Tracer runs natively on Mac ARM, the blog goes live, and we're learning why having the right foundation matters.
Hackathon Day One: Monday, June 2nd
A hackathon kick-off note from Laura, our COO, and records from our first day in Nairobi, Kenya.
Tracer Glossary
Comprehensive definitions of key terms in bioinformatics, computational biology, and scientific computing.




