Data analysis was performed within the framework of REGULATORS (Exploiting inter-species conservation in promoter sequences to identify regulators of reproductive development and physiological performance), a Trilateral Co-Operation in Plant Genomics between Spain (MCyT), France (GENOPLANTE) and Germany (GABI) coordinated by G. Coupland (coupland-ad-mpiz-koeln.mpg.de). Authors: Vincent Thareau (IBP-Orsay UMR8618 CNRS-UPS, thareau-ad-ibp.u-psud.fr) and Alain Lecharny (URGV-Evry UMR INRA-CNRS-UEVE, lecharny-ad-ibp.u-psud.fr).
Definition of the terms used to describe the quality of the clone: The approximately 2250 sequences from the clone collection were grouped by clone and clustered. If more than one contig was formed for a clone, the clone was designated 'Contamination'.
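The 'Contamination' rule can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name and the input layout (one pair of clone identifier and contig identifier per clustered sequence) are assumptions, and how singletons were counted is not stated in the description, so only the contig identifiers passed in are counted.

```python
from collections import defaultdict


def flag_contaminated(contig_assignments):
    """Return {clone_id: True/False}, True when the clone yielded more than one contig.

    contig_assignments: iterable of (clone_id, contig_id) pairs (hypothetical layout),
    one pair per sequence that was placed in a contig.
    """
    contigs_per_clone = defaultdict(set)
    for clone_id, contig_id in contig_assignments:
        contigs_per_clone[clone_id].add(contig_id)
    # A clone is designated 'Contamination' when its sequences form more than one contig.
    return {clone: len(contigs) > 1 for clone, contigs in contigs_per_clone.items()}
```

For example, a clone whose two sequences fall into the same contig is not flagged, whereas a clone whose sequences are split across two contigs is.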
# The contigs and singletons were compared by BLAST against the CDS plus pseudogenes from the TIGRv5 annotation, and the resulting AGI code is reported if a hit with more than 90 percent identity was found.
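The 90 percent identity threshold for reporting an AGI code can be sketched as a simple filter. The function name and the hit layout are hypothetical, and picking the hit with the highest identity among those that pass is an assumption; the description only states the threshold.

```python
def reported_agi(hits, min_identity=90.0):
    """Return the AGI code to report, or None when no hit exceeds the threshold.

    hits: list of (agi_code, percent_identity) tuples (hypothetical layout).
    """
    # Keep only hits with strictly more than 90 percent identity.
    passing = [(identity, agi) for agi, identity in hits if identity > min_identity]
    if not passing:
        return None
    # Assumption: report the best-scoring hit among those that pass.
    return max(passing)[1]
```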
## The sequences were also compared by BLAST against all TIGRv5 introns, and matches longer than 50 bp with 95 percent identity are reported as 'intron found'. The remaining terms for SeqAnalysis describe the outcome of the evaluation of the CDS detected after pairwise alignment with the CDS plus pseudogenes from the TIGRv5 annotation file. For sequences or contigs in which a full CDS, with or without a STOP codon, was detected, a BLASTp search against all TIGRv5 protein sequences was performed. Full perfect: 100 percent identity; full good: better than 95 percent identity over more than 95 percent of the sequence; partial good: better than 95 percent identity over less than 95 percent of the sequence; weak similarity: less than 95 percent identity over less than 95 percent of the sequence; no similarity: no hit from BLASTp.
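The SeqAnalysis terms above can be expressed as a small decision function. This is a sketch under stated assumptions: the function names and arguments are hypothetical, 'with 95 percent identity' is read as at least 95 percent, and combinations the definitions do not cover (for example less than 95 percent identity over more than 95 percent of the sequence, or values exactly at 95) are returned as 'undefined by source' rather than guessed.

```python
def intron_found(match_length_bp, percent_identity):
    """True for an intron match longer than 50 bp (assumed: identity of at least 95 percent)."""
    return match_length_bp > 50 and percent_identity >= 95.0


def classify_cds(identity, coverage):
    """Map a BLASTp result to a SeqAnalysis term.

    identity, coverage: percentages, or None when BLASTp returned no hit.
    """
    if identity is None:
        return "no similarity"
    if identity == 100.0:
        # The definition mentions identity only, so coverage is not checked here.
        return "full perfect"
    if identity > 95.0 and coverage > 95.0:
        return "full good"
    if identity > 95.0 and coverage < 95.0:
        return "partial good"
    if identity < 95.0 and coverage < 95.0:
        return "weak similarity"
    # Placeholder label (not from the source) for cases the definitions leave open.
    return "undefined by source"
```

Keeping the uncovered cases explicit makes it easy to see how many clones, if any, fall outside the published definitions.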