Illumina Next Generation Sequencing

Sample Submission

Prior to dropping off samples or libraries in the lab, please ensure that a GIGPAD batch has been created, and include the batch paperwork with any samples.

Please send samples for library construction only OR for library construction and sequence detection only to:

Jessica Yi
Partners HealthCare Center for Personalized Genetic Medicine
65 Landsdowne Street, Suite 350
Cambridge MA 02139
Phone: 617-768-8430

The package should be clearly labeled with the sender's contact details including telephone number and should also include the GIGPAD batch paperwork. Please address all inquiries for library construction samples to and

Please send libraries for sequence detection only to:

Sara Graber
Partners HealthCare Center for Personalized Genetic Medicine
65 Landsdowne Street, Suite 350
Cambridge MA 02139
Phone: (617) 768-8435

The package should be clearly labeled with the sender's contact details including telephone number and should also include the GIGPAD batch paperwork. Please address all inquiries for sequence detection libraries to and

More Information

Contact Sami Amr at 617-768-8377 or Alison Brown at 617-768-8470 for more information.

PCPGM offers next generation sequencing services on Illumina's HiSeq 2000, using massively parallel sequencing-by-synthesis technology to generate DNA sequence data with high throughput and accuracy. The HiSeq 2000 produces approximately 100M reads per lane. More information can be found on the Illumina website.


This technology is often used for whole genome and targeted resequencing, but has a variety of applications, including:

  • The study of interactions between proteins and DNA via chromatin immunoprecipitation (ChIP-seq)
  • Expression profiling in bacteria or mammals using digital gene expression (GEX)
  • The exploration of micro-RNAs and other ribonucleic acids via RNA-seq techniques
  • Detection of chromosomal rearrangements, sequencing of repetitive regions, or de novo assemblies using paired end sequencing

In addition to sequence detection, PCPGM offers library construction from:

  • Genomic DNA
  • Total RNA and mRNA
  • shRNA

Investigators are urged to schedule a consultation to help select the optimal method for their projects. Inquiries can be directed to Sami Amr or Alison Brown.


Sample Submission Requirements

All submitted samples will be QC’d within 48 hours of arrival. Concentration and purity are checked by OD using a Nanodrop 1000 and by PicoGreen assay using a Qubit analyser; for total RNA, RIN is also assessed using Agilent Bioanalyzer chips or TapeStation tape. Any sample failing QC will be returned to the customer for replacement. Where no replacement is available, we will process the failing samples but cannot guarantee the success of library construction, and no troubleshooting can be attempted.


Requirements for Library Construction

Whole Genome library construction - we require a minimum of 1 ug starting material, at a concentration of over 50 ng/ul as assessed by picogreen. Please contact us if your DNA quantity is limiting. The purity of the sample should be measured by the 260/280 OD ratio, and should be more than 1.8. A gel picture is required to measure the quality of the submitted sample. Our library construction method uses Beckman Coulter SPRIworks with nextFlex adaptors, followed by PCR enrichment and size selection using SAGE Pippin Prep.
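As a rough illustration, the whole genome requirements above can be turned into a quick pre-submission check. The sketch below is not a PCPGM tool; the class and function names are hypothetical, and only the three numeric thresholds come from the text.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the whole genome requirements above.
MIN_MASS_UG = 1.0           # minimum starting material
MIN_CONC_NG_PER_UL = 50.0   # PicoGreen concentration must be over this value
MIN_A260_280 = 1.8          # 260/280 ratio must be more than this value


@dataclass
class GenomicDnaSample:
    """Hypothetical record of one genomic DNA sample."""
    name: str
    conc_ng_per_ul: float   # concentration measured by PicoGreen
    volume_ul: float
    a260_280: float

    @property
    def mass_ug(self) -> float:
        # ng/ul * ul = ng; divide by 1000 for ug
        return self.conc_ng_per_ul * self.volume_ul / 1000.0


def wgs_submission_issues(sample: GenomicDnaSample) -> List[str]:
    """Return the unmet requirements; an empty list means all three are met."""
    issues = []
    if sample.mass_ug < MIN_MASS_UG:
        issues.append(f"{sample.name}: {sample.mass_ug:.2f} ug is below 1 ug")
    if sample.conc_ng_per_ul <= MIN_CONC_NG_PER_UL:
        issues.append(f"{sample.name}: concentration is not over 50 ng/ul")
    if sample.a260_280 <= MIN_A260_280:
        issues.append(f"{sample.name}: 260/280 is not above 1.8")
    return issues
```

A sample at 60 ng/ul in 20 ul (1.2 ug) with a 260/280 of 1.85 returns no issues. The gel picture still has to be supplied separately.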

RNA-Seq library construction - All total RNA samples should be in aqueous solution (DNase- and RNase-free water, not DEPC-treated) and should be DNA-free. If RNA is extracted using TRIzol or another organic reagent, the final clean-up step must use an RNeasy or similar column method. The sample should have a 260/280 ratio of 1.75-2.0, a 260/230 ratio of 1.75-2.0, and a RIN greater than 7.
Two methods of RNA-Seq library construction are offered, depending on the amount of total RNA available. For lower amounts of total RNA (500 pg – 100 ng in a maximum volume of 4 ul), we use the Nugen Ovation RNA-Seq System v2 for cDNA generation. Library construction can then be carried out using SPRIworks as described above. For higher amounts of total RNA (0.1-4 ug in a maximum volume of 50 ul) we use the Illumina TruSeq RNA-Seq kit.
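The RNA thresholds and the two input ranges above can be expressed as a short decision helper. This is a sketch only, with hypothetical function names. It assumes that exactly 100 ng in 4 ul or less goes to the low-input route, since the two stated ranges meet at that value.

```python
from typing import List, Optional


def rna_qc_issues(a260_280: float, a260_230: float, rin: float) -> List[str]:
    """Check a total RNA sample against the stated purity and integrity limits."""
    issues = []
    if not 1.75 <= a260_280 <= 2.0:
        issues.append("260/280 outside 1.75-2.0")
    if not 1.75 <= a260_230 <= 2.0:
        issues.append("260/230 outside 1.75-2.0")
    if rin <= 7:
        issues.append("RIN is not greater than 7")
    return issues


def choose_rnaseq_method(total_rna_ng: float, volume_ul: float) -> Optional[str]:
    """Pick a library construction route from input amount and volume.

    Returns None when the sample fits neither stated range.
    """
    # Low input: 500 pg (0.5 ng) to 100 ng in at most 4 ul
    if 0.5 <= total_rna_ng <= 100 and volume_ul <= 4:
        return "Nugen Ovation RNA-Seq System v2, then SPRIworks"
    # Higher input: 0.1 ug (100 ng) to 4 ug (4000 ng) in at most 50 ul
    if 100 <= total_rna_ng <= 4000 and volume_ul <= 50:
        return "Illumina TruSeq RNA-Seq kit"
    return None
```

For example, 50 ng in 10 ul fits neither range (too dilute for the low-input route and too little for TruSeq), so the helper returns None and the sample would need to be discussed with the facility.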

shRNA library construction - The quantity of genomic DNA required for library construction depends on the complexity of the screen being carried out. Please refer to our shRNA service page for more details on DNA amounts.

Customer-Constructed Library Submissions
The most common customer-supplied libraries sequenced at the PCPGM NGS service are SAGE/DGE, ChIP-Seq, pooled shRNA, concatenated PCR amplicons and genomic shotgun libraries. Researchers are encouraged to QC and normalize their libraries to a concentration of between 20 and 100 nM, with a minimum volume of 20 ul, resuspended in clear buffer or water. Concentration should be assessed by PicoGreen quantitation only. A gel picture or Bioanalyzer profile must also be provided to show the size of the library. All libraries must be purified using the QIAquick PCR purification kit (Qiagen, Cat# 28104) or Agencourt Ampure XP (Beckman, Cat# A63881). No other kits can be substituted.
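PicoGreen reports concentration in ng/ul, while the submission range is stated in nM, so a conversion using the library size from the gel or Bioanalyzer profile is needed. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA, which is an assumption and not a figure from this page; the function names are hypothetical.

```python
AVG_MASS_PER_BP = 660.0  # g/mol per bp of dsDNA (assumed approximation)


def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a mass concentration (ng/ul) to molarity (nM).

    nM = (ng/ul * 1e6) / (660 g/mol/bp * mean fragment length in bp)
    """
    if mean_fragment_bp <= 0:
        raise ValueError("mean fragment length must be positive")
    return conc_ng_per_ul * 1e6 / (AVG_MASS_PER_BP * mean_fragment_bp)


def meets_submission_range(molarity_nm: float, volume_ul: float) -> bool:
    """True if the library is 20-100 nM with at least 20 ul, as requested above."""
    return 20.0 <= molarity_nm <= 100.0 and volume_ul >= 20.0
```

For a 400 bp library at 10 ng/ul, this gives roughly 37.9 nM, which falls inside the requested range provided at least 20 ul is submitted.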

Library QC
All libraries are QC’d by PicoGreen using Qubit, then normalized to 10 nM, followed by qPCR using P5 and P7 adaptors to measure the amount of library containing the correct adaptors. The concentration is then further adjusted to give a suitable number of clusters during cluster generation. Any library failing QC will be returned to the customer, or can be run with no guarantee of read number.
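The normalization to 10 nM is a standard C1V1 = C2V2 dilution. A minimal sketch follows; the 20 ul final volume is a placeholder for illustration, not the facility's actual working volume.

```python
from typing import Tuple


def dilution_to_target(stock_nm: float,
                       target_nm: float = 10.0,
                       final_volume_ul: float = 20.0) -> Tuple[float, float]:
    """Return (stock volume, diluent volume) in ul for a C1V1 = C2V2 dilution."""
    if stock_nm < target_nm:
        raise ValueError("stock is already below the target concentration")
    stock_ul = target_nm * final_volume_ul / stock_nm
    return stock_ul, final_volume_ul - stock_ul
```

A 40 nM library diluted to 10 nM in 20 ul needs 5 ul of library and 15 ul of diluent.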

Guaranteed Read Number
HiSeq 2000 flow cell: We guarantee 100 million clusters per lane of raw unfiltered data.
MiSeq flow cell: We currently guarantee 10 million clusters per lane of raw unfiltered data.
Any high quality, high diversity library not reaching this level will be offered a free rerun of that same library. If any issues are seen during sequencing with the quality of the library, we will inform the customer immediately.
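The guaranteed cluster numbers can be used to plan how many samples to pool per lane. The sketch below assumes evenly balanced pooling and uses the raw, unfiltered guarantees quoted above; the function names are hypothetical.

```python
# Guaranteed raw, unfiltered clusters per lane, as stated above.
GUARANTEED_CLUSTERS = {
    "HiSeq 2000": 100_000_000,
    "MiSeq": 10_000_000,
}


def max_samples_per_lane(platform: str, clusters_per_sample: int) -> int:
    """Largest evenly pooled sample count that still meets the per-sample target."""
    return GUARANTEED_CLUSTERS[platform] // clusters_per_sample


def lanes_needed(platform: str, n_samples: int, clusters_per_sample: int) -> int:
    """Lanes required for all samples, rounded up to whole lanes."""
    total = n_samples * clusters_per_sample
    per_lane = GUARANTEED_CLUSTERS[platform]
    return -(-total // per_lane)  # integer ceiling division
```

Ten samples at 25 million clusters each need 250 million clusters, which rounds up to 3 HiSeq 2000 lanes. Clusters that fail the quality filter are not accounted for here, so some headroom is sensible.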

Please see our NGS Frequently Asked Questions to learn how to reduce issues with low diversity libraries and how to ensure that balanced indexes are used.
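One way to screen a set of indexes for balance before pooling is to check each index cycle for signal in both detection channels. The sketch below assumes the commonly described scheme for this instrument family, in which A and C are read in one channel and G and T in the other. That rule is an assumption here and is not taken from the FAQ, which remains the authoritative guide.

```python
from typing import List, Sequence

# Assumed channel grouping (not from this page): A/C in one channel, G/T in the other.
RED_CHANNEL = set("AC")
GREEN_CHANNEL = set("GT")


def unbalanced_index_cycles(indexes: Sequence[str]) -> List[int]:
    """Return 1-based cycle positions where the pool lacks a base in either channel."""
    if len({len(index) for index in indexes}) != 1:
        raise ValueError("provide at least one index, all of equal length")
    flagged = []
    # zip(*indexes) yields the bases seen by the instrument at each cycle
    for cycle, bases in enumerate(zip(*indexes), start=1):
        present = {base.upper() for base in bases}
        if not (present & RED_CHANNEL) or not (present & GREEN_CHANNEL):
            flagged.append(cycle)
    return flagged
```

Under this assumption, pooling only ATCACG and CGATGT leaves cycles 1, 2, 3 and 6 with signal in a single channel, whereas ATCACG pooled with TAGTGC is balanced at every cycle.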

Turnaround time
Turnaround time is always of concern to our customers. We work on a first come, first served basis and queue libraries only after they pass QC. Our most frequent run is PE50, which launches at least once every 2 weeks. Faster turnaround can be offered to customers who fill a full flow cell (7 lanes), as these can generally be launched soon after QC. We are able to run paired end libraries on both paired end and single end flow cells, so if you require a single end run, putting it on a paired end flow cell may speed up the return of data.

Please bear in mind the amount of time taken to generate data. For example,

  • a PE50 flow cell is on the HiSeq 2000 for around 5 days
  • a PE100 flow cell is on the HiSeq 2000 for around 12 days

This means that it can be around 2-3 weeks after launch of the flow cell before data become available to the customer.

Next Gen Seq Bioinformatics Support

Basic service for all Sequencing
As part of the sequencing service we offer standard bioinformatics analysis on all the samples we receive. The options available to customers, along with their requirements, are listed below. Fulfilling all the requirements ensures timely return of processed data. Any change from the service selected in GIGPAD will delay data return and may incur an additional fee.
Under each option, the deliverable data types and the mode of delivery are listed. The user must satisfy the requirements and state the type of data and the mode of delivery with their order.

  • Base Calls: Fastq formatted sequence files are generated for all base calls. This format can be easily converted to other formats.
  • Base Calls+Alignment: ELAND is used to perform alignment on your samples using a reference sequence provided by the customer. This will allow users to get an initial estimate on the quality of the data.
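As an example of converting the delivered FASTQ files to another format, the sketch below writes FASTA from FASTQ in plain Python. It assumes the usual four-line FASTQ records (sequence and quality not wrapped across lines); the function names are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between records
        # Pull the remaining three lines of the record; None marks a truncated file
        rest = [next(it, None) for _ in range(3)]
        if None in rest:
            raise ValueError("truncated FASTQ record")
        header = header.rstrip("\r\n")
        seq, plus, qual = (line.rstrip("\r\n") for line in rest)
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], seq, qual


def fastq_to_fasta(lines: Iterable[str]) -> str:
    """Return FASTA text for the given FASTQ lines, dropping quality scores."""
    return "".join(f">{read_id}\n{seq}\n" for read_id, seq, _ in parse_fastq(lines))
```

The function accepts any iterable of lines, so an open file handle can be passed directly, for example `fastq_to_fasta(open("reads.fastq"))`, where the file name is a placeholder.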

Delivery Options          Requirements              Deliverable Data                Mode of Delivery
Base Calling              (none)                    FASTQ Sequence, summary         Partners File Transfer
Base Calling + Alignment  Reference Sequence/Files  FASTQ Sequence, export summary  Partners File Transfer Utility, FTP


RNA-Seq Report

The resultant fastq files are further analyzed by viewing FastQC HTML reports that summarize various metrics related to sequence quality and content. Subsequent analysis is carried out as follows:

  1. TopHat is used to align sequencing reads to a reference genome
  2. A custom HTML-based alignment QC report is generated using output from various tools (e.g. SAMtools, Picard tools) and includes the number of bases assigned to various classes of RNA, the number of reads mapped to each chromosome, and 5’/3’ bias for each sample.
  3. Cufflinks is used to assemble transcripts
  4. Cuffdiff is used to estimate transcript abundances and perform differential expression tests
  5. A custom HTML-based differential expression report is generated using output from CummeRbund. Snapshots of raw reads for some genes are provided based on IGV plots.
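For readers who want to reproduce the alignment, assembly and differential expression steps on their own data, the sketch below builds the corresponding command lines without running them. It is not the facility's pipeline. The paths, sample names and output layout are hypothetical, only basic options are used (`-o` for the output directory, `-L` for Cuffdiff condition labels), and it assumes an annotation GTF is passed straight to Cuffdiff.

```python
from typing import Dict, List


def tuxedo_commands(bowtie_index: str,
                    samples: Dict[str, List[str]],
                    annotation_gtf: str,
                    out_dir: str = "results") -> List[List[str]]:
    """Build TopHat, Cufflinks and Cuffdiff argument lists for a set of samples.

    `samples` maps a sample name to its FASTQ file(s): one file for single end
    reads, two for paired end reads.
    """
    commands: List[List[str]] = []
    bams: Dict[str, str] = {}
    for name, fastqs in samples.items():
        tophat_dir = f"{out_dir}/tophat/{name}"
        # Step 1: align reads to the reference genome
        commands.append(["tophat", "-o", tophat_dir, bowtie_index, *fastqs])
        # TopHat writes its alignments to accepted_hits.bam in the output directory
        bams[name] = f"{tophat_dir}/accepted_hits.bam"
        # Step 3: assemble transcripts from the alignments
        commands.append(["cufflinks", "-o", f"{out_dir}/cufflinks/{name}", bams[name]])
    # Step 4: estimate abundances and test for differential expression
    commands.append(["cuffdiff", "-o", f"{out_dir}/cuffdiff",
                     "-L", ",".join(bams), annotation_gtf, *bams.values()])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed. The QC and CummeRbund reporting steps are custom to the facility and are not covered by this sketch.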

For libraries that have been constructed at our facility, we add an ERCC Spike-in Mix (Ambion) to each RNA sample to serve as a sample-independent control to detect outliers and assess QC across experiments. A summary diagram of our RNA-Seq pipeline is shown below.


Before any sequencing or library construction can be carried out, a completed batch in GIGPAD is required. Help documents for setting up studies and batches in GIGPAD can be found here. Please be aware that order entry requires a PC version of MS Excel 2003 and will not work when used with a Mac or Linux system. Please contact Alison Brown if you have any difficulties.

Please also see our NGS Frequently Asked Questions to ensure that indexes are correctly entered into our LIMS, which allows fast turnaround for demultiplexing of pooled libraries.