De novo transcriptome of Phakopsora pachyrhizi by Illumina short-read sequencing

Phakopsora pachyrhizi is an obligate biotrophic fungal pathogen of soybean that causes Asian soybean rust (ASR), a devastating disease that can cause yield losses of 80% or more. P. pachyrhizi secretes an arsenal of effector proteins to manipulate host immunity and promote disease. Current knowledge of the P. pachyrhizi genome is limited, and only a small number of its effectors have been identified. We therefore sequenced the transcriptome of P. pachyrhizi during infection to identify candidate secreted effector proteins (CSEPs). Total RNA was extracted from P. pachyrhizi-infected soybean leaf tissue collected at 3, 7, 10, and 14 days after inoculation (dai). Strand-specific cDNA libraries were prepared from ribosomal RNA-depleted RNA using the NEBNext Ultra II RNA library prep kit according to the manufacturer's instructions. Paired-end short-read Illumina sequencing was performed on the dual-indexed cDNA libraries using HiSeq 3000 (150 bp from each end) and MiSeq (300 bp from each end) instruments (https://www.illumina.com/). The raw reads were quality assessed using FastQC v0.11.2. Paired-end reads were trimmed with Trimmomatic 0.36 using a 4:15 sliding window (window size 4, required average quality 15), and reads shorter than 36 bases after trimming were discarded. Trimmed paired-end RNA-Seq reads from 3, 7, 10, and 14 dai were mapped against the soybean genome v2.1 (https://plants.ensembl.org/Glycine_max) using the STAR 2.5.3a aligner to remove soybean reads. The non-soybean short reads were assembled de novo using Trinity v2.6.6. A BLASTN (Basic Local Alignment Search Tool) search was performed on the assembled non-soybean transcripts using BLAST+ v2.6.0 (NCBI: National Center for Biotechnology Information) to remove any plant transcripts with a query coverage greater than 80% and an identity greater than 95%.
The final de novo transcriptome containing non-plant, non-soybean transcripts for each time point was used for prediction of candidate effectors.
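
The plant-transcript filtering step described above could be sketched roughly as follows. This is a minimal illustration, not the pipeline actually used: it assumes the BLASTN results were written in tabular form with the columns qseqid, sseqid, pident, and qcovs (for example via -outfmt "6 qseqid sseqid pident qcovs"), and the function names and example transcript IDs are hypothetical. Whether coverage was measured per subject (qcovs) or per HSP is not stated in the description, so qcovs is an assumption.

```python
import csv
import io

# Assumed column layout of the tabular BLASTN output (not stated in the source).
COLUMNS = ("qseqid", "sseqid", "pident", "qcovs")


def plant_like_ids(blast_rows, min_identity=95.0, min_coverage=80.0):
    """Return query IDs with at least one hit exceeding both thresholds.

    blast_rows is any iterable of tab-separated lines (e.g. an open file).
    """
    flagged = set()
    reader = csv.DictReader(blast_rows, fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        identity = float(row["pident"])
        coverage = float(row["qcovs"])
        # "Greater than" on both criteria, as in the description above.
        if identity > min_identity and coverage > min_coverage:
            flagged.add(row["qseqid"])
    return flagged


def keep_non_plant(transcript_ids, flagged):
    """Keep transcripts that were not flagged as plant-derived, in input order."""
    return [tid for tid in transcript_ids if tid not in flagged]


# Hypothetical example: three assembled transcripts, one hit each.
example_hits = io.StringIO(
    "TRINITY_DN1_c0_g1_i1\tplantA\t98.2\t91\n"   # high identity, high coverage
    "TRINITY_DN2_c0_g1_i1\tplantB\t96.0\t40\n"   # high identity, low coverage
    "TRINITY_DN3_c0_g1_i1\tplantC\t88.5\t99\n"   # low identity, high coverage
)
all_ids = [
    "TRINITY_DN1_c0_g1_i1",
    "TRINITY_DN2_c0_g1_i1",
    "TRINITY_DN3_c0_g1_i1",
]

flagged = plant_like_ids(example_hits)
kept = keep_non_plant(all_ids, flagged)
print(kept)
```

In this sketch only the first transcript satisfies both criteria and is removed; a transcript that passes one threshold but not the other is retained, which matches the conjunction of coverage and identity stated in the description.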