Frequently Asked Questions


How do I run PolyPhred on a Macintosh computer?

PolyPhred, like Phred, Phrap and Consed, runs on UNIX and Linux operating systems. Mac OS X includes a UNIX-like environment in which such programs can run. To use it, open a "terminal window" as follows: open the Applications folder, then the Utilities folder, and click on the Terminal application.

Within the terminal window, you need to enter UNIX commands to set up directories, move files around, and run the programs. If you are unfamiliar with UNIX, a brief UNIX tutorial can help you get started. Then follow the instructions in the PolyPhred documentation under "Installing PolyPhred", and below that, "Running PolyPhred".


Why does PolyPhred miss SNPs in a data set with only a few reads?

Before PolyPhred searches for SNPs, it carries out a processing step. During this phase, PolyPhred calculates at each position in the alignment an average homozygous peak. To do this, it uses all of the sites in each read that appear initially to be homozygous.

During the search phase, PolyPhred compares each site against these average homozygous peaks. Significant differences from the average contribute to a high score for heterozygosity.

If the data set contains only a few reads, then the average homozygous peaks might not represent a true average at some positions. This is particularly true if one is trying to analyze data sets with only one or two reads. In this case, at positions where there are no homozygous sites represented, an average homozygous site will not be calculated, and PolyPhred will fail to find the heterozygotes at that position.
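The effect can be seen in a toy calculation. The sketch below is only an illustration of the averaging idea, not PolyPhred's code: the four-column input (position, read name, peak value, and a 1/0 flag for "looks homozygous") and the function name avg_homozygous are invented for this example.

```shell
# Toy illustration only: NOT PolyPhred's algorithm or file format.
# Hypothetical input columns: position  read  peak  looks_homozygous(1/0)
avg_homozygous() {
  awk '
    { seen[$1] = 1 }                       # remember every position
    $4 == 1 { sum[$1] += $3; n[$1]++ }     # use only homozygous-looking sites
    END {
      for (p in seen) {
        if (n[p] > 0) printf "%s %.1f\n", p, sum[p] / n[p]
        else          printf "%s none\n", p   # no average can be formed
      }
    }' | sort -n
}

# Two reads: both look homozygous at position 101, neither does at 102.
printf '%s\n' \
  '101 readA 900 1' \
  '101 readB 1100 1' \
  '102 readA 500 0' \
  '102 readB 480 0' | avg_homozygous
```

With only two reads, position 102 has no homozygous site to average, so there is nothing to compare the heterozygous sites against. Adding reads that are homozygous at that position would supply the missing average.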

We have not conducted a rigorous study to determine the optimal number of reads to include in a data set. In general, the more reads a data set contains (up to a point), the more reliable the average. In our experience, most of the known SNPs were found in a data set containing eight reads.

If one wishes to analyze only one or a few traces at a time, one could include in the data set several traces from a source that is known to be homozygous at all positions. These traces should come from independent sequencing runs.


When I turn indel detection on, Consed reports an error and does not run. How do I fix this problem?

When PolyPhred identifies a putative indel site, it inserts an 'indelSite' tag in the ace file and 'indel' tags in some of the phd files. The current version of Consed cannot interpret these tags unless the .consedrc file is customized. Among other things, this file allows for the specification of user-defined tags.

To edit the .consedrc file, one must first locate the file, or if it does not exist, create one. The easiest way to determine if the file exists and where it is located is to ask the person who installed Consed. If this is not possible (or if that person is you), then you will need to locate the file or create it yourself.

First, try locating the file using the following procedures.

1) Type this line:

 env | grep CONSED
If this line appears:
 CONSED_PARAMETERS=[path]
where [path] is the directory containing the .consedrc file, then you have located the file. Skip to 'Editing the .consedrc file' below.

2) Look in your home directory. Type:

 cd
 ls -a
If .consedrc is listed among the files in your home directory, you have found it. Skip to 'Editing the .consedrc file' below.

3) Type:

 slocate .consedrc
If slocate is installed on your system and this works, it will show you where a .consedrc file is located. Skip to 'Editing the .consedrc file' below.

4) Look to see if it is with the Consed executable file. Type (in csh or tcsh; in bash, use 'which consed' instead):

 where consed
Typically, the executable is in a directory like:
 /usr/local/genome/bin/
If this is the case, then type something similar to:
 cd /usr/local/genome/
 find . -name .consedrc -print
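
The search steps above can also be combined into one small shell function. This is a minimal sketch, not part of Consed: find_consedrc is a hypothetical helper name, the slocate step is left out because it depends on a database that may not exist on your system, and CONSED_PARAMETERS is accepted whether it names the file itself or the directory that contains it.

```shell
# Hypothetical helper (not part of Consed): print the first .consedrc found.
find_consedrc() {
  # Step 1: CONSED_PARAMETERS may name the file itself or its directory.
  if [ -n "${CONSED_PARAMETERS:-}" ]; then
    if [ -f "$CONSED_PARAMETERS" ]; then
      echo "$CONSED_PARAMETERS"; return 0
    elif [ -f "$CONSED_PARAMETERS/.consedrc" ]; then
      echo "$CONSED_PARAMETERS/.consedrc"; return 0
    fi
  fi
  # Step 2: the home directory.
  if [ -f "$HOME/.consedrc" ]; then
    echo "$HOME/.consedrc"; return 0
  fi
  # Step 4: search the directory tree that holds the consed executable,
  # e.g. /usr/local/genome/ when consed is in /usr/local/genome/bin/.
  exe=$(command -v consed 2>/dev/null) || exe=""
  if [ -n "$exe" ] && [ -x "$exe" ]; then
    base=$(dirname "$(dirname "$exe")")
    hit=$(find "$base" -name .consedrc -print 2>/dev/null | head -n 1)
    if [ -n "$hit" ]; then echo "$hit"; return 0; fi
  fi
  return 1
}

find_consedrc || echo "No .consedrc found; you will need to create one."
```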

Editing the .consedrc file

Once you have located a .consedrc file, you are ready to edit it. If you could not locate the file, you will need to create it. You can create it in your home directory, in which case only you will use it, or in a 'global' directory like /usr/local/genome/ so that other Consed users can access it.

To edit the .consedrc file, open it with your favorite text editor. Add the following lines:

 consed.customConsensusTag1: indelSite
 consed.tagColorCustomConsensusTag1: DarkCyan
 consed.customTag1: indel
 consed.tagColorCustomTag1: DarkOrange
If the tags already exist in the file, then change the final '1' to a different number to make the tags unique.
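
If you prefer to script this edit, the following is one possible sketch. The helper name add_polyphred_tags is hypothetical, and the choice of "first number not already used by a customConsensusTag or customTag line" is an assumption about how to keep the tags unique; check the result by eye before relying on it.

```shell
# Hypothetical helper: append the PolyPhred tag definitions to a .consedrc
# file, using the first tag number that the file does not already use.
add_polyphred_tags() {
  rc_file=$1
  touch "$rc_file"
  # Do nothing if an indelSite tag is already defined.
  if grep -q ':[[:space:]]*indelSite' "$rc_file"; then
    return 0
  fi
  tag_num=1
  while grep -Eq "^[[:space:]]*consed\.custom(Consensus)?Tag${tag_num}:" "$rc_file"; do
    tag_num=$((tag_num + 1))
  done
  # Make sure the existing file ends with a newline before appending.
  if [ -s "$rc_file" ] && [ -n "$(tail -c 1 "$rc_file")" ]; then
    echo >> "$rc_file"
  fi
  {
    echo "consed.customConsensusTag${tag_num}: indelSite"
    echo "consed.tagColorCustomConsensusTag${tag_num}: DarkCyan"
    echo "consed.customTag${tag_num}: indel"
    echo "consed.tagColorCustomTag${tag_num}: DarkOrange"
  } >> "$rc_file"
}

# Demonstration on a scratch file in which tag number 1 is already taken.
demo_rc=$(mktemp)
echo "consed.customTag1: myOwnTag" > "$demo_rc"
add_polyphred_tags "$demo_rc"
cat "$demo_rc"
```

To apply it to a real file, pass that file's path instead of the scratch file, for example add_polyphred_tags "$HOME/.consedrc".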

Finally, if the .consedrc file is located in a directory other than your home directory, you need to tell Consed where to find it by setting an environment variable in your shell startup file. Locate that file in your home directory (for example, .cshrc for csh, or .bashrc or .bash_profile for bash) and add one of these lines (where [path] is the complete path to the .consedrc file).

For csh:
 setenv CONSED_PARAMETERS [path]

For bash:
 CONSED_PARAMETERS=[path]
and add CONSED_PARAMETERS to the 'export' line, or write both steps as a single line:
 export CONSED_PARAMETERS=[path]


What is the format of the .poly files?

The .poly files are written by the program Phred when the -dd flag is used. These files provide additional information that PolyPhred needs to identify putative SNPs.

The first line of the poly file contains the name of the corresponding trace file, followed by five numbers. The first number is the minimum of the following four numbers. These four numbers are scaling factors for the A trace, C trace, G trace and T trace, respectively.

The remaining lines have information for each called base in the sequence. The fields are:

  1. the primary base
  2. the location of the peak corresponding to the primary base (this matches the peak location given in the phd file)
  3. area under the primary peak
  4. area under the primary peak relative to surrounding peaks
  5. the secondary base
  6. the location of the peak corresponding to the secondary base
  7. area under the secondary peak
  8. area under the secondary peak relative to surrounding peaks
  9. height of the A peak at the primary peak location
  10. height of the C peak at the primary peak location
  11. height of the G peak at the primary peak location
  12. height of the T peak at the primary peak location
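
As a reading aid, the layout described above can be unpacked with a short awk script. This is a sketch under assumptions: it assumes the fields are separated by whitespace, the function name poly_summary is hypothetical, and the sample values below are invented for illustration rather than taken from a real trace.

```shell
# Hypothetical helper: label the fields of a .poly file, following the
# field order listed above. Assumes whitespace-separated fields.
poly_summary() {
  awk '
    NR == 1 {
      # Header: trace name, minimum scaling factor, then A, C, G, T factors.
      printf "trace=%s min=%s A=%s C=%s G=%s T=%s\n", $1, $2, $3, $4, $5, $6
      next
    }
    NF >= 12 {
      # One line per called base: primary call, secondary call, and the
      # A, C, G, T peak heights at the primary peak location.
      printf "base %d: primary=%s@%s secondary=%s@%s heights=%s,%s,%s,%s\n",
             NR - 1, $1, $2, $5, $6, $9, $10, $11, $12
    }' "$@"
}

# Invented sample data, for illustration only.
printf '%s\n' \
  'example.scf 0.8 1.0 0.8 1.2 0.9' \
  'A 12 1500.0 1.05 G 12 300.0 0.21 820 15 160 9' \
  'C 24 1400.0 0.98 T 25 90.0 0.06 11 790 8 60' | poly_summary
```

To run it on a real file, pass the file name as an argument: poly_summary yourtrace.poly.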

