| GST |
| ID |
The ID number of the GST. The format is CATMAxyzzzzz,
where x is the chromosome number and y is the version letter. |
| Repository version 1-2 | The GST letter is 'a' or 'b'. Letter 'b' probes are
redesigned, improved versions of letter 'a' probes. Control probes have 'ctrl' in their name. |
| Repository version 3 | The GST letter is 'c'. |
| Repository version 4 | The GST letter is 'd'. |
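The ID layout described above can be sketched as a small parser. This is only an illustration: the regular expression assumes exactly five trailing digits for the "zzzzz" part and a single chromosome digit, and the function name `parse_gst_id` is hypothetical. Control probes (with 'ctrl' in their name) are not expected to match and yield `None`.

```python
import re

# Assumed pattern: "CATMA", one chromosome digit, one version letter (a-d),
# then five digits for the "zzzzz" part (the digit count is an assumption).
CATMA_ID = re.compile(r"^CATMA(?P<chrom>\d)(?P<letter>[a-d])(?P<serial>\d{5})$")

# Repository version implied by the version letter, as listed in the table.
LETTER_TO_REPOSITORY = {"a": "1-2", "b": "1-2", "c": "3", "d": "4"}


def parse_gst_id(gst_id):
    """Split a GST ID into its parts, or return None if it does not match."""
    match = CATMA_ID.match(gst_id)
    if match is None:
        return None
    letter = match.group("letter")
    return {
        "chromosome": int(match.group("chrom")),
        "letter": letter,
        "serial": match.group("serial"),
        "repository": LETTER_TO_REPOSITORY[letter],
    }
```

The ID used in the tests is made up for illustration and is not taken from the database.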
| GST Location |
The position of the GST within the gene model.
For example, exon 2-2/4 means that the GST lies completely within exon 2 of a total of 4
exons, while exon 4-5/10 means that the GST starts in exon 4,
spans an intron, and finishes in exon 5 of 10. |
| Repository version 1-2 | The value originally given by SPADS
at the time of design, which therefore refers to the structure of the design template used (in most cases
equal to the exon structure of the transcript sequence given in gene_sequence). |
| Repository version 3 | The value presented was calculated from the
transcript sequence of the target gene (TIGR5 or Eugène040917), extended with an artificial 3' UTR.
For target genes with multiple splice variants (TIGR5 only), an NA value is given. |
| Repository version 4 | The value presented was calculated from the
transcript sequence of the target TAIR6 gene, without artificial 3' UTR extension. For target genes with multiple
splice variants, an NA value is given. GSTs of v4 were not constrained to start or stop within an exon: if a GST
starts before an exon, a '<' sign is added before the exon number; if it stops after an exon, a '>' sign is added
after the exon number. |
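As a rough illustration of the location notation, the sketch below parses values such as "exon 2-2/4" and "exon 4-5/10". The exact placement of the v4 '<' and '>' signs inside the string (directly before the first and directly after the last exon number) is an assumption, as is the function name `parse_gst_location`.

```python
import re

# Assumed layout: "exon <first>-<last>/<total>", with an optional '<' before
# the first exon number and an optional '>' after the last one (v4 only).
LOCATION = re.compile(
    r"^exon\s+(?P<before><)?(?P<first>\d+)-(?P<last>\d+)(?P<after>>)?/(?P<total>\d+)$"
)


def parse_gst_location(value):
    """Parse a GST Location value; NA (multiple splice variants) gives None."""
    if value is None or value.strip().upper() == "NA":
        return None
    match = LOCATION.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognised GST location: {value!r}")
    first = int(match.group("first"))
    last = int(match.group("last"))
    return {
        "first_exon": first,
        "last_exon": last,
        "total_exons": int(match.group("total")),
        # True when the GST starts and ends in different exons.
        "multi_exon": first != last,
        "starts_before_exon": match.group("before") is not None,
        "stops_after_exon": match.group("after") is not None,
    }
```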
| GST Type |
E means completely within one exon, I means spanning or overlapping with an intron.
'1' means a similarity < 40%, '2' a similarity between 40% and 70% and '3' a similarity > 70% |
| Repository version 1-2 | The value originally given by SPADS, with the number corresponding
to the initial similarity value. Note that the similarity value itself has since been recalculated. |
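The combination of letter and number can be sketched as follows. Two points are assumptions: that the two parts are simply concatenated (for example "E1"), and that the boundary values 40% and 70% fall into class 2. The function names are hypothetical.

```python
def similarity_class(similarity_percent):
    """Map a similarity percentage to class 1, 2 or 3 (boundaries assumed inclusive in 2)."""
    if similarity_percent < 40:
        return 1
    if similarity_percent <= 70:
        return 2
    return 3


def gst_type(within_single_exon, similarity_percent):
    """Build a GST Type code: 'E' or 'I' followed by the similarity class."""
    letter = "E" if within_single_exon else "I"
    return f"{letter}{similarity_class(similarity_percent)}"
```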
| GST Intron % |
The percentage of the GST corresponding to an intronic region |
| Repository version 1-2 | The intron percentage originally given by SPADS at
the time of design. This percentage refers to the structure of the design template used (in most cases
equal to the exon structure of the transcript sequence given in gene_sequence). |
| Repository version 3 | The intron percentage was calculated from the
transcript sequence of the target gene (TAIR6 or Eugène040917), extended with an artificial 3' UTR. For target
genes with multiple splice variants (TAIR6 only), an average value is given. |
| Repository version 4 | The intron percentage was calculated from the
transcript sequence of the target TAIR6 gene, without artificial 3' UTR extension. For target
genes with multiple splice variants, an average value is given. |
| GST Similarity % |
GST specificity expressed as the percentage of sequence identity
in the best non-trivial blast hit, i.e. the best hit not matching the target gene sequence,
using the genome sequence (ATH1_chrX.1con.01222004, X=1,2,3,4,5) as a blast database. |
| Repository version 1-2 | Some GSTs could not be mapped to a current gene model,
which is why the sequence identity of the second-best blast hit was taken without performing
the triviality check. |
| Repository version 3 | Values were taken from SPADS output. Triviality was checked
taking into account the minimum and maximum coordinate of the target gene transcript. SPADS also takes into account
the sequence identity of the second-best, third-best and fourth-best non-trivial blast hit, in case they are
within close proximity of the best non-trivial blast hit. |
| Repository version 4 | Values were calculated manually. Triviality was checked
taking into account the minimum and maximum coordinate of the target gene transcript. |
| Baldino Flag |
Advisory flag warning of potential off-target hybridization. Any GST that has a blast hit
with a calculated hybridization temperature equal to or higher than 45 degrees Celsius is flagged. The Tm calculation was performed
according to the formula Tm = 16.6 log10([Na+]) + 0.41 (%GC) + 81.5 - 675/length - 0.65 (%formamide) -
%mismatch (after Baldino, F., Jr., Chesselet, M. F. & Lewis, M. E. (1989) Methods Enzymol. 168, 761-777), with %formamide = 50% and [Na+] = 0.666 M.
The trivial blast hit was not taken into account, nor were blast hits entirely enclosed by an exon of a target gene of the GST
(the latter criterion was applied only to GSTs of class GST3, GST4 and GST5). |
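The Tm formula and the flagging threshold above can be written out as a short calculation. This is a sketch, not the code used to build the database: the function and parameter names are hypothetical, the logarithm is taken as base 10, and the defaults are the 50% formamide and 0.666 M Na+ quoted in the text.

```python
import math


def baldino_tm(gc_percent, length, mismatch_percent,
               na_molar=0.666, formamide_percent=50.0):
    """Hybridization temperature (degrees Celsius) of one blast hit."""
    return (
        16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        + 81.5
        - 675.0 / length
        - 0.65 * formamide_percent
        - mismatch_percent
    )


def baldino_flag(tm, threshold=45.0):
    """A hit is flagged when its Tm is equal to or higher than the threshold."""
    return tm >= threshold
```

For a hypothetical 300 bp hit with 50% GC and no mismatches this gives roughly 64.3 degrees, which would be flagged; with 20% mismatch the same hit drops to roughly 44.3 degrees and would not.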
| GST Start and Stop |
The start and stop coordinates of the probe's blast hit on the
ATH1_chrX.1con.01222004 sequence, where X is the corresponding chromosome number. |
| Repository version 1-2 | These GSTs were originally designed using an
older release of the Arabidopsis sequence. As a consequence, for a small fraction of these probes the blast hits
on the latest Arabidopsis sequence release do not cover 100% of the probe sequence.
In these cases the difference between stop and start is smaller than the recorded probe sequence length. |
| GST 96 Well Plate/GST Coordinates |
The CATMA GSTs of version 1, 2 and 3 are stored on 317 96-well mother plates,
numbered from 96101 to 96480. The first two digits are always '96', the last two digits
form the number of the corresponding group of 96-well plates, and the third digit represents the
ordering number (to be) used when rearraying this plate group onto 384-well plates. The GST Coordinates format
is xy(y), where x and y(y) are the row letter and
column number respectively. |
| Repository version 4 | These probes are stored separately, together with the Gene Family Tags, at
UNIL. No GST 96 Well Plate/GST Coordinates information is available. |
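The plate numbering and well coordinate conventions can be decoded as sketched below. The row range A-H and column range 1-12 are the usual 96-well layout and are assumed here rather than stated in the text; the function names are hypothetical.

```python
import re

# '96', then the ordering digit for 384-well rearraying, then the plate group.
PLATE = re.compile(r"^96(?P<order>\d)(?P<group>\d{2})$")
# Row letter followed by a one- or two-digit column number.
WELL = re.compile(r"^(?P<row>[A-H])(?P<col>\d{1,2})$")


def parse_plate_number(plate):
    """Split a mother plate number such as 96101 into ordering number and group."""
    match = PLATE.match(str(plate))
    if match is None:
        raise ValueError(f"unrecognised plate number: {plate!r}")
    return {"order": int(match.group("order")), "group": int(match.group("group"))}


def parse_well(coord):
    """Split a well coordinate such as 'H12' into (row letter, column number)."""
    match = WELL.match(coord.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised well coordinate: {coord!r}")
    column = int(match.group("col"))
    if not 1 <= column <= 12:
        raise ValueError(f"column out of range for a 96-well plate: {column}")
    return match.group("row"), column
```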
| Gene Sequence |
The + strand of the transcript sequence of the targeted gene |
| Repository version 1-2 | The transcript sequence of the target gene as it was known at design time. For
consistency, this transcript sequence was not updated. Moreover, some GSTs could not be mapped to a
current gene model. This mapping information is stored in the gene_mappings table and in the gene_mapping field of this table. |
| Repository version 3 | When different splice variants were available, the GST was designed on an
intersected gene model. Nevertheless, the transcript of only one of the possible splice variants is shown, without artificial 3' UTR extension. |
| Repository version 4 | When different splice variants were available, the GST was designed on one
of the possible splice variants. The transcript of this chosen splice variant is shown here, without artificial 3' UTR extension. |
| Amplification Results |
Results of the primary PCR amplification of the GST. B means amplification from BACs, G from genomic. 0 means no
product was detectable in gel electrophoresis analysis, 1 is a product of the right size, 2 is a smear or multiple bands and 3 means the
product appears to be of the wrong size. |
| Repository version 4 | Currently no complete amplification results are available. |
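A small lookup can turn an amplification result into readable text. The sketch assumes the value is stored as a two-character code, template letter followed by outcome digit (for example "B1"); that layout and the function name are assumptions.

```python
# Template used for the primary PCR amplification.
TEMPLATES = {"B": "BAC", "G": "genomic DNA"}

# Outcome as scored by gel electrophoresis.
OUTCOMES = {
    "0": "no product detectable",
    "1": "product of the right size",
    "2": "smear or multiple bands",
    "3": "product appears to be of the wrong size",
}


def decode_amplification(code):
    """Decode a result such as 'B1' into (template, outcome)."""
    code = code.strip()
    if len(code) != 2 or code[0] not in TEMPLATES or code[1] not in OUTCOMES:
        raise ValueError(f"unrecognised amplification result: {code!r}")
    return TEMPLATES[code[0]], OUTCOMES[code[1]]
```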
| Sequence Verified |
A small percentage of the CATMA GSTs have been verified by sequencing.
This field indicates whether a particular GST has been sequence validated.
|
| Model Type |
Indication of the Arabidopsis genome annotation version of the original target gene. Possible values are
'TIGR5', 'TAIR6' and 'EUGENE170904'. |
| Repository version 1-2 | Version 2 probes were either designed based on TIGR3, TIGR4 or an earlier
Eugène annotation release. Information about the exact annotation used was not retrieved and the value for this column is left blank. |
| Template Sequence |
The sense strand of that part of the transcript sequence used for GST design. |
| Repository version 1-2 | Information about the template sequence used was not retrieved and the value for this column is left blank. |
| Repository version 3 | The template sequence was derived as follows: an artificial 3' UTR extension of 150 bp was added if no 3' UTR was present in the original transcript; regions overlapping with transcripts of other genes were removed; in case of splice variants, the intersection of the different gene models was taken. |
| Repository version 4 | Representative Family Sequence (RFS) was used as template sequence. The RFS design is described in Sclep, G. et al. 2007. |
| Primers |
| TM |
Melting temperature calculated using the nearest neighbour (NN) method as performed by Primer3
(Rychlik, Spencer and Rhoads, Nucleic Acids Research, vol 18, no 21, page 6410, eqn 2 with NN table from
Breslauer, Frank, Bloecker and Marky, Proc. Natl. Acad. Sci. USA, vol 83, page 3748, table 2). |
| Start and Stop |
Start and stop of primer with respect to the template sequence. The start coordinate of the 3' primer is always higher than the stop coordinate. The start coordinate of the 5' primer is always lower
than the stop coordinate. |
| Repository version 1-2 | As the template sequence is empty for these probes, the primer start and stop coordinates
were consistently left blank. |
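The orientation rule for primer coordinates can be checked as sketched below. The product length calculation assumes 1-based, inclusive coordinates on the template sequence, which the text does not state; the function names are hypothetical.

```python
def primer_side(start, stop):
    """Infer the primer from its coordinates: start < stop is 5', start > stop is 3'."""
    if start < stop:
        return "5'"
    if start > stop:
        return "3'"
    raise ValueError("start and stop must differ")


def product_length(five_prime_start, three_prime_start):
    """Amplicon length on the template, assuming 1-based inclusive coordinates."""
    if three_prime_start <= five_prime_start:
        raise ValueError("the 3' primer must start downstream of the 5' primer")
    return three_prime_start - five_prime_start + 1
```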
| Gene Mapping |
| TAIR 7 |
Comma-separated list of TAIR7 AGI code(s) of the nuclear protein-encoding gene model(s) tagged by GSTs (mapping classified as GST3, GST4 or GST5, see Sclep, G. et al. 2007 for more
details on the mapping algorithm). Because a small fraction of GSTs do not tag all alternative splice forms of a gene, the name of the gene model is given instead of the gene name. When no TAIR 7 gene models are tagged, the field is left blank.
|
| TAIR 7 Gene Description |
A list of textual descriptions of the genes listed in the 'TAIR 7' field. One description is given per gene, not distinguishing between different gene models/splice variants. In case of multiple genes (genes, not gene models) listed in the tair_7 column, the different gene descriptions are separated by a ‘@’ character. The descriptions correspond to a concatenation of the ‘COM_NAME’ and ‘PUB_COMMENT’ fields from the TAIR7 annotation files. When no TAIR 7 gene models are tagged, the field is left blank.
|
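The two TAIR 7 fields can be combined as sketched below. The sketch assumes that gene model names are AGI codes with a ".N" suffix (for example AT1G01010.1) and that the '@'-separated descriptions follow the order in which the genes first appear in the 'TAIR 7' list; the ordering and the function names are assumptions.

```python
def split_gene_models(tair_7):
    """Split the comma-separated 'TAIR 7' field into gene model names."""
    return [model.strip() for model in tair_7.split(",") if model.strip()]


def gene_names(models):
    """Reduce gene model names to unique gene names, keeping first-seen order."""
    names = []
    for model in models:
        name = model.split(".")[0]  # drop the splice variant suffix
        if name not in names:
            names.append(name)
    return names


def pair_descriptions(tair_7, descriptions):
    """Map each gene to its description; both fields are blank when nothing is tagged."""
    genes = gene_names(split_gene_models(tair_7))
    parts = [part.strip() for part in descriptions.split("@")] if descriptions.strip() else []
    if len(parts) != len(genes):
        raise ValueError("number of descriptions does not match number of genes")
    return dict(zip(genes, parts))
```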
| Eugene 040917 |
Comma separated list of Eugène040917 IDs of the nuclear protein-encoding gene model(s) tagged by GSTs (mapping classified as GST3, GST4 or GST5, see Sclep, G. et al. 2007 for more details on the mapping algorithm). When no Eugène040917 gene models are tagged, the field is left blank.
|
| GST Class |
GST class code obtained when the GST was mapped against the TAIR7 and Eugène040917 gene models collectively. When more genome
annotations are taken into account, the likelihood of finding a cognate gene, and thus of classifying a probe as GST5, increases. A GST of class GST5 can be considered to cover its target gene sufficiently, without showing a risk of
cross-hybridization. See Sclep, G. et al. 2007 for more details on the
mapping algorithm. |
| Repository version 4 | Due to a different design process for the v4 repository,
some probes overlap with exonic regions of their target gene over less than 100 bp. The mapping algorithm classified
these probes as GST2. For the other repository versions, the corresponding genes are not listed for probes
classified as GST2; an exception was made for the v4 GSTs. |