
BLAST2 result
TBLASTN 2.2.2 [Dec-14-2001]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Query= TM0044a.5
(578 letters)
Database: MTGI
36,976 sequences; 27,044,181 total letters
Searching..................................................done
Sequences producing significant alignments: Score (bits) E Value
BF648150 similar to GP|14586969|gb| pol polyprotein {Citrus x pa... 125 3e-37
BG587170 similar to PIR|F86470|F8 probable retroelement polyprot... 108 4e-24
TC93066 weakly similar to GP|19920130|gb|AAM08562.1 Putative ret... 74 2e-13
BG644747 67 2e-11
BG450974 similar to PIR|T05112|T05 splicing factor 9G8-like SR p... 38 0.012
TC81230 36 0.046
TC87868 similar to PIR|T05112|T05112 splicing factor 9G8-like SR... 34 0.14
TC89725 similar to PIR|T05494|T05494 glycine-rich protein T19K4.... 34 0.18
TC81207 similar to GP|21322752|dbj|BAB78536. cold shock protein-... 34 0.18
AJ388976 similar to PIR|E84638|E84 probable RSZp22 splicing fact... 33 0.30
TC84935 similar to PIR|G96631|G96631 probable RNA-binding protei... 32 0.51
CB891135 weakly similar to GP|9758415|dbj contains similarity to... 32 0.88
BG645355 similar to PIR|G96590|G965 hypothetical protein T24C10.... 32 0.88
TC78750 weakly similar to GP|16519227|gb|AAL25130.1 cellulose sy... 32 0.88
TC89153 similar to GP|18855061|gb|AAL79753.1 putative RNA helica... 30 3.3
TC81278 similar to GP|17979004|gb|AAL47462.1 At2g46180/T3F17.17 ... 29 5.7
BG648607 weakly similar to PIR|F96614|F96 probable copia-type po... 28 7.4
TC78961 similar to GP|18252179|gb|AAL61922.1 unknown protein {Ar... 28 9.7
BE325848 weakly similar to GP|8886948|gb F2D10.21 {Arabidopsis t... 28 9.7
TC85193 similar to GP|14423502|gb|AAK62433.1 Unknown protein {Ar... 28 9.7
>BF648150 similar to GP|14586969|gb| pol polyprotein {Citrus x paradisi},
partial (3%)
Length = 658
Score = 125 bits (315), Expect(2) = 3e-37
Identities = 56/154 (36%), Positives = 99/154 (63%)
Frame = +2
Query: 45 EPKDDDSDELKTERKKRREDELLCRGHIMNTLSDRLYDLYTDTQSAAKIWKTLEFKFKAE 104
E KDD++ +R+K D+ +C GHI+N +SD L+D+Y + SA +W LE ++ E
Sbjct: 131 EDKDDETVAETRDRQKWDNDDYICLGHILNGMSDSLFDIYQSSPSAKDLWDKLETRYMRE 310
Query: 105 EEGIKKFLISKYFDFKMLDTKPILQQVHELQVLVNKIKAVKIDIRGAFQVGAIIAKLPPS 164
+ KKFL+S + ++KM+D K +++Q++E++ ++N K +++ V +II KLPPS
Sbjct: 311 DATSKKFLVSHFNNYKMVDNKSVMEQLYEIERILNNYKQHNMNMDETIIVSSIIDKLPPS 490
Query: 165 WNGYRKKLLHNYEDFSLEKIQKHLRIEEESKVRD 198
W +++ + H ED SLE++ HLR+ EE + ++
Sbjct: 491 WKDFKRTMKHKKEDISLEQLGNHLRLXEEYRKQE 592
Score = 48.1 bits (113), Expect(2) = 3e-37
Identities = 21/34 (61%), Positives = 25/34 (72%)
Frame = +1
Query: 1 MNQDLVKLDRFDGTNFTRWQDKMIFLLTALKIYY 34
M D VKL++F+G NF RWQ KM FLLT LK+ Y
Sbjct: 7 MTSDFVKLEKFNGGNFIRWQKKMKFLLTTLKVAY 108
>BG587170 similar to PIR|F86470|F8 probable retroelement polyprotein
[imported] - Arabidopsis thaliana, partial (13%)
Length = 718
Score = 108 bits (271), Expect = 4e-24
Identities = 71/199 (35%), Positives = 107/199 (53%), Gaps = 1/199 (0%)
Frame = -3
Query: 380 VYESEKLILTRNGVFVGKGYSAEGMIKLCPIDNIINKVSNSAYMIDSVSLWHSRLAHIGI 439
V ES +LI GV G Y E KL P+ N ++S+ ++ +LWH+RL H
Sbjct: 713 VIESSQLI--GEGVTKGDLYMLE---KLDPVSNYKCSFTSSS-SLNKDALWHARLGHPHG 552
Query: 440 STMNRLIKSKLISCNIHEFEKCEICVKSKMIKKHFKSVERKF-NLLDLVHSDLCEFNGML 498
+N ++ + E + CE C+ K K F + N DL+++DL L
Sbjct: 551 RALNLMLPGV-----VFENKNCEACILGKHCKNVFPRTSTVYENCFDLIYTDLWTAPS-L 390
Query: 499 TRGGNRYFITFIDDCSRYTHVYLLKHKDDAFNAFKSYKAEVENQLNKTIKVLRSDRGGEY 558
+R ++YF+TFID+ S+YT + L+ KD +AFK+++A V N + IK+LRSD GGEY
Sbjct: 389 SRDNHKYFVTFIDEKSKYTWLTLIPSKDRVIDAFKNFQAYVTNHYHAKIKILRSDNGGEY 210
Query: 559 FSTKFDSFCEEHDIIHECS 577
S F S + H I+H+ S
Sbjct: 209 TSYAFKSHLDHHGILHQTS 153
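Note on the Frame lines: in a TBLASTN report the Sbjct coordinates are nucleotide positions in the database sequence, and the frame can be recovered from them. The sketch below is not part of the BLAST output; the function name `tblastn_frame` is hypothetical, and the minus-strand rule is the usual convention of counting from the 3' end of the subject, which is assumed here rather than stated in the report. The sample values are the BF648150 and BG587170 hits above.

```python
def tblastn_frame(sbjct_start: int, sbjct_end: int, subject_length: int) -> int:
    """Return the signed reading frame implied by 1-based nucleotide HSP coordinates."""
    if sbjct_start <= sbjct_end:
        # Plus strand: the frame is set by the offset of the first aligned base.
        return (sbjct_start - 1) % 3 + 1
    # Minus strand: the offset is measured from the 3' end of the subject.
    return -((subject_length - sbjct_start) % 3 + 1)


def codons_spanned(sbjct_start: int, sbjct_end: int) -> int:
    """Number of subject codons covered by an HSP."""
    return (abs(sbjct_end - sbjct_start) + 1) // 3


# BF648150 (Length = 658): two HSPs on the plus strand
print(tblastn_frame(131, 592, 658))   # reported Frame = +2
print(tblastn_frame(7, 108, 658))     # reported Frame = +1
# BG587170 (Length = 718): one HSP on the minus strand
print(tblastn_frame(713, 153, 718))   # reported Frame = -3
# The first BF648150 HSP has no gaps, so its codon count equals
# the alignment length of 154 given on the Identities line.
print(codons_spanned(131, 592))
```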
>TC93066 weakly similar to GP|19920130|gb|AAM08562.1 Putative retroelement
{Oryza sativa} [Oryza sativa (japonica cultivar-group)],
partial (10%)
Length = 823
Score = 73.9 bits (180), Expect = 2e-13
Identities = 37/97 (38%), Positives = 55/97 (56%)
Frame = +1
Query: 476 SVERKFNLLDLVHSDLCEFNGMLTRGGNRYFITFIDDCSRYTHVYLLKHKDDAFNAFKSY 535
+ R +LD +HSDL + + + GG RY +T IDD R VY L++K++ F FK +
Sbjct: 73 ATHRTKGILDYIHSDLWGPSKVTSYGGRRYMMTIIDDFPRKVWVYFLRYKNETFPTFKKW 252
Query: 536 KAEVENQLNKTIKVLRSDRGGEYFSTKFDSFCEEHDI 572
+ VE Q K +K L +D E+ S+ F+ FC H I
Sbjct: 253 RILVETQTGKNVKKLITDN*LEFCSSDFNEFCTNHGI 363
>BG644747
Length = 685
Score = 67.0 bits (162), Expect = 2e-11
Identities = 37/108 (34%), Positives = 58/108 (53%), Gaps = 1/108 (0%)
Frame = +1
Query: 64 DELLCRGHIMNTLSDRLYDLYTDTQSAAK-IWKTLEFKFKAEEEGIKKFLISKYFDFKML 122
D CR HI L D YD Y T S++K IWK L+ + E+ K+ S +F FKM+
Sbjct: 238 DSYKCRYHIFKCLYDNFYDYYDRTYSSSKKIWKALQSMYDIEDARA*KYTDS*FFRFKMV 417
Query: 123 DTKPILQQVHELQVLVNKIKAVKIDIRGAFQVGAIIAKLPPSWNGYRK 170
D K ++ Q + ++V +++ ++ I V I+ KLPPS ++K
Sbjct: 418 DNKSMVDQAQDFIMIVRYLRSKEVKIGDNLIVCGIVDKLPPS*KKFQK 561
>BG450974 similar to PIR|T05112|T05 splicing factor 9G8-like SR protein
RSZp22 [validated] - Arabidopsis thaliana, partial (54%)
Length = 364
Score = 37.7 bits (86), Expect = 0.012
Identities = 19/50 (38%), Positives = 24/50 (48%), Gaps = 7/50 (14%)
Frame = +1
Query: 220 DGKQQHLGPKKEHNKFKNNNGTKGPKGG-------CYVCGKPDHFARDCR 262
DGK ++K G G +GG CY CG+P HFAR+CR
Sbjct: 214 DGKNGWRVQLSHNSKSGGGGGRGGGRGGRGGDDLKCYECGEPGHFARECR 363
>TC81230
Length = 958
Score = 35.8 bits (81), Expect = 0.046
Identities = 23/142 (16%), Positives = 65/142 (45%), Gaps = 2/142 (1%)
Frame = +1
Query: 12 DGTNFTRWQDKMIFLLTALKIYYVLDPDLTPIAEPKDDDSDEL--KTERKKRREDELLCR 69
+G+N+ W + M L +++ + D + KDD +D K E + +++
Sbjct: 241 NGSNYNHWAESMCGFLKGRRLWRYVTGDKKCPTKGKDDTADAFADKLEEWDSKNHQIIT- 417
Query: 70 GHIMNTLSDRLYDLYTDTQSAAKIWKTLEFKFKAEEEGIKKFLISKYFDFKMLDTKPILQ 129
NT ++ + ++A ++W L+ ++ + + L+ + K +P+ +
Sbjct: 418 -WFRNTSIPSIHMQFGRFENAKEVWDHLKQRYTISDLSHQYQLLKDLSNLKQQSGQPVYE 594
Query: 130 QVHELQVLVNKIKAVKIDIRGA 151
+ +++V+ N++ + + ++ A
Sbjct: 595 FLAQMEVIWNQLTSCEPSLKDA 660
>TC87868 similar to PIR|T05112|T05112 splicing factor 9G8-like SR protein
RSZp22 [validated] - Arabidopsis thaliana, partial (91%)
Length = 860
Score = 34.3 bits (77), Expect = 0.14
Identities = 13/27 (48%), Positives = 16/27 (59%)
Frame = +3
Query: 236 KNNNGTKGPKGGCYVCGKPDHFARDCR 262
+ G G CY CG+P HFAR+CR
Sbjct: 300 RGGGGGGGSDLKCYECGEPGHFARECR 380
>TC89725 similar to PIR|T05494|T05494 glycine-rich protein T19K4.150 -
Arabidopsis thaliana, partial (17%)
Length = 378
Score = 33.9 bits (76), Expect = 0.18
Identities = 13/22 (59%), Positives = 14/22 (63%)
Frame = +1
Query: 240 GTKGPKGGCYVCGKPDHFARDC 261
G G G CY CG+ HFARDC
Sbjct: 112 GGGGGGGSCYSCGESGHFARDC 177
Score = 31.2 bits (69), Expect = 1.1
Identities = 11/19 (57%), Positives = 13/19 (67%)
Frame = +1
Query: 243 GPKGGCYVCGKPDHFARDC 261
G GGCY CG+ H AR+C
Sbjct: 1 GVGGGCYNCGESGHMAREC 57
>TC81207 similar to GP|21322752|dbj|BAB78536. cold shock protein-1 {Triticum
aestivum}, partial (39%)
Length = 630
Score = 33.9 bits (76), Expect = 0.18
Identities = 20/59 (33%), Positives = 26/59 (43%)
Frame = +3
Query: 203 SGFSKANTVTTKGKKKYDGKQQHLGPKKEHNKFKNNNGTKGPKGGCYVCGKPDHFARDC 261
+G +KA VT + +Q + G F+ G GGCY CG H ARDC
Sbjct: 219 NGKTKAVDVTGPKGEPLQVRQDNHGGGGGGRGFRGGERRNGG-GGCYTCGDTGHIARDC 392
Score = 33.1 bits (74), Expect = 0.30
Identities = 16/28 (57%), Positives = 16/28 (57%), Gaps = 3/28 (10%)
Frame = +3
Query: 237 NNNGTKGPKGG---CYVCGKPDHFARDC 261
NNNG G GG CY CG H ARDC
Sbjct: 522 NNNGGGGYGGGGTSCYRCGGVGHIARDC 605
Score = 33.1 bits (74), Expect = 0.30
Identities = 16/47 (34%), Positives = 21/47 (44%), Gaps = 6/47 (12%)
Frame = +3
Query: 221 GKQQHLGPKKEHNKFKNNNGTKGPKGG------CYVCGKPDHFARDC 261
G H+ + + + N G GG CY CG +HFARDC
Sbjct: 363 GDTGHIARDCDRSDRNDRNDRSGGGGGGDRDRACYTCGSFEHFARDC 503
>AJ388976 similar to PIR|E84638|E84 probable RSZp22 splicing factor
[imported] - Arabidopsis thaliana, partial (62%)
Length = 508
Score = 33.1 bits (74), Expect = 0.30
Identities = 14/31 (45%), Positives = 17/31 (54%), Gaps = 6/31 (19%)
Frame = +2
Query: 240 GTKGPKGG------CYVCGKPDHFARDCRQN 264
G +G GG CY CG+P HFAR C +
Sbjct: 302 GGRGRSGGGGSDLKCYXCGEPGHFARXCNSS 394
>TC84935 similar to PIR|G96631|G96631 probable RNA-binding protein F8A5.17
[imported] - Arabidopsis thaliana, partial (41%)
Length = 552
Score = 32.3 bits (72), Expect = 0.51
Identities = 13/42 (30%), Positives = 21/42 (49%)
Frame = +2
Query: 220 DGKQQHLGPKKEHNKFKNNNGTKGPKGGCYVCGKPDHFARDC 261
D Q++ G + G + + C+ CG+P H+ARDC
Sbjct: 389 DADQRYRGGFSSGGRGSYGAGDRVGQDDCFKCGRPGHWARDC 514
>CB891135 weakly similar to GP|9758415|dbj contains similarity to unknown
protein~gene_id:MHF15.13~pir||T01776 {Arabidopsis
thaliana}, partial (22%)
Length = 827
Score = 31.6 bits (70), Expect = 0.88
Identities = 15/34 (44%), Positives = 22/34 (64%)
Frame = +1
Query: 162 PPSWNGYRKKLLHNYEDFSLEKIQKHLRIEEESK 195
PP ++ Y +K+ NYED L K Q HL+ +E+ K
Sbjct: 502 PPCFSSYAEKIFQNYEDI-LRKNQYHLQDKEKLK 600
>BG645355 similar to PIR|G96590|G965 hypothetical protein T24C10.5 [imported]
- Arabidopsis thaliana, partial (5%)
Length = 627
Score = 31.6 bits (70), Expect = 0.88
Identities = 10/24 (41%), Positives = 15/24 (61%)
Frame = -1
Query: 238 NNGTKGPKGGCYVCGKPDHFARDC 261
+ G+ G G CY C +P H+A +C
Sbjct: 432 SGGSGGASGNCYKCNQPGHWANNC 361
Score = 30.0 bits (66), Expect = 2.6
Identities = 16/66 (24%), Positives = 27/66 (40%)
Frame = -1
Query: 239 NGTKGPKGGCYVCGKPDHFARDCRQNKTKKEVNAVQVDDEIIATVSEVMAVKGKVPGWWY 298
+G+ G G CY C +P H+A +C V+ + + K PG W
Sbjct: 528 SGSGGASGKCYKCQQPGHWASNCPSMSAANRVSGG-------SGGASGNCYKCNQPGHWA 370
Query: 299 DTCASV 304
+ C ++
Sbjct: 369 NNCPNM 352
>TC78750 weakly similar to GP|16519227|gb|AAL25130.1 cellulose synthase-like
protein OsCslE2 {Oryza sativa}, partial (6%)
Length = 1044
Score = 31.6 bits (70), Expect = 0.88
Identities = 17/78 (21%), Positives = 35/78 (44%), Gaps = 3/78 (3%)
Frame = +3
Query: 426 SVSLWHSRLAHIGISTMNRLIKSKLISCNIHEFEKCE---ICVKSKMIKKHFKSVERKFN 482
S+ LW ++A+ + + L+K I C+IH C + ++ I K ++++
Sbjct: 279 SIHLWDFKIAYCSLFNLLSLVKFSPICCSIHPIWNCSSTLLLARNTCISKGYRALVYSVC 458
Query: 483 LLDLVHSDLCEFNGMLTR 500
+ + SD + G R
Sbjct: 459 SIVCIQSDSTFYRGNYNR 512
>TC89153 similar to GP|18855061|gb|AAL79753.1 putative RNA helicase {Oryza
sativa}, partial (3%)
Length = 737
Score = 29.6 bits (65), Expect = 3.3
Identities = 12/25 (48%), Positives = 15/25 (60%)
Frame = +3
Query: 237 NNNGTKGPKGGCYVCGKPDHFARDC 261
N G+ G G C+ CG+P H A DC
Sbjct: 120 NRRGSYG--GACFSCGQPGHRASDC 188
>TC81278 similar to GP|17979004|gb|AAL47462.1 At2g46180/T3F17.17
{Arabidopsis thaliana}, partial (10%)
Length = 857
Score = 28.9 bits (63), Expect = 5.7
Identities = 22/75 (29%), Positives = 31/75 (41%), Gaps = 3/75 (4%)
Frame = +1
Query: 170 KKLLHNYEDFSLEKIQKHLRIEEES---KVRDKAESSGFSKANTVTTKGKKKYDGKQQHL 226
K L NY EK +R+ +E+ K +A S S N KG +QH
Sbjct: 424 KALSVNYAALLKEKEDHIIRLNKENGSLKQNLEATSPASSNGNH-RVKGSSDQSSNRQHR 600
Query: 227 GPKKEHNKFKNNNGT 241
+ N++ NNGT
Sbjct: 601 SATQMKNRYTTNNGT 645
>BG648607 weakly similar to PIR|F96614|F96 probable copia-type polyprotein
T18I24.5 [imported] - Arabidopsis thaliana, partial (2%)
Length = 563
Score = 28.5 bits (62), Expect = 7.4
Identities = 17/48 (35%), Positives = 27/48 (55%), Gaps = 4/48 (8%)
Frame = +2
Query: 235 FKNNNGTKGP--KGGCYVCGKPDHFARDCRQNKTK--KEVNAVQVDDE 278
F +NN KG + CY C K ++A CR +K++ KE+ ++ DE
Sbjct: 371 FTSNNYAKGTMIQIQCYNCRKFSYYALKCRFSKSRVEKEIQYMKEKDE 514
>TC78961 similar to GP|18252179|gb|AAL61922.1 unknown protein {Arabidopsis
thaliana}, partial (71%)
Length = 974
Score = 28.1 bits (61), Expect = 9.7
Identities = 12/31 (38%), Positives = 17/31 (54%), Gaps = 3/31 (9%)
Frame = +2
Query: 242 KGP---KGGCYVCGKPDHFARDCRQNKTKKE 269
+GP G C+ CG H+ARDC+ K +
Sbjct: 332 RGPPPGSGRCFNCGIDGHWARDCKAGDWKNK 424
>BE325848 weakly similar to GP|8886948|gb F2D10.21 {Arabidopsis thaliana},
partial (8%)
Length = 449
Score = 28.1 bits (61), Expect = 9.7
Identities = 21/85 (24%), Positives = 36/85 (41%), Gaps = 8/85 (9%)
Frame = +3
Query: 349 KKVTLVNVLHVPEMSRDLV---SGDLLGKPGIKSVYESEKLILTRNGV-----FVGKGYS 400
K++ V+ PE +R + + L K G V++ E L+ V + + S
Sbjct: 132 KQLLKDKVIGCPEFNRSVQKVKAHPSLQKGGCNEVHDIEDLVKVGQSVKGCSYYAARSMS 311
Query: 401 AEGMIKLCPIDNIINKVSNSAYMID 425
+ + CP + IIN V A +D
Sbjct: 312 NDAQLVFCPYNYIINPVIRGAMEVD 386
>TC85193 similar to GP|14423502|gb|AAK62433.1 Unknown protein {Arabidopsis
thaliana}, partial (73%)
Length = 1351
Score = 28.1 bits (61), Expect = 9.7
Identities = 16/38 (42%), Positives = 20/38 (52%)
Frame = -1
Query: 500 RGGNRYFITFIDDCSRYTHVYLLKHKDDAFNAFKSYKA 537
RG F+ ID C+ Y H+ LL D FNA SY +
Sbjct: 1240 RGVENNFM*VIDKCNLYNHISLLIQFYD-FNAHSSYSS 1130
Database: MTGI
Posted date: Oct 22, 2004 3:39 PM
Number of letters in database: 27,044,181
Number of sequences in database: 36,976
Lambda K H
0.318 0.135 0.397
Gapped
Lambda K H
0.267 0.0410 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 16,949,716
Number of Sequences: 36976
Number of extensions: 224102
Number of successful extensions: 1305
Number of sequences better than 10.0: 42
Number of HSP's better than 10.0 without gapping: 1292
Number of HSP's successfully gapped in prelim test: 0
Number of HSP's that attempted gapping in prelim test: 0
Number of HSP's gapped (non-prelim): 1303
length of query: 578
length of database: 9,014,727
effective HSP length: 101
effective length of query: 477
effective length of database: 5,280,151
effective search space: 2518632027
effective search space used: 2518632027
frameshift window, decay const: 50, 0.1
T: 13
A: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 61 (28.1 bits)
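Note on the statistics block: the figures above are mutually consistent and can be rechecked by hand. The sketch below is not part of the BLAST output. It assumes the standard Karlin-Altschul conversion from raw score S to bit score, S' = (lambda * S - ln K) / ln 2, with the gapped Lambda and K printed above. It also assumes that the "length of database" is the letter count divided by 3 (nucleotides to codons), which is inferred from the numbers in this report rather than stated in it. The function and variable names are hypothetical.

```python
import math


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to bits (Karlin-Altschul normalisation)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Gapped parameters as printed in the report (rounded, so results are approximate).
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410

# Raw scores in parentheses on the Score lines, with the bit scores reported beside them.
reported = {180: 73.9, 162: 67.0, 113: 48.1, 61: 28.1}
for raw, bits in reported.items():
    print(raw, bits, round(bit_score(raw, GAPPED_LAMBDA, GAPPED_K), 2))

# Effective lengths and search space, rebuilt from the report's own figures.
query_length = 578
database_letters = 27_044_181
database_sequences = 36_976
effective_hsp_length = 101

effective_query = query_length - effective_hsp_length
database_length = database_letters // 3  # assumption: nucleotides counted as codons
effective_database = database_length - database_sequences * effective_hsp_length
search_space = effective_query * effective_database

print(effective_query, database_length, effective_database, search_space)
```

Because Lambda and K are printed to only three significant figures, the recomputed bit scores agree with the reported ones to about one decimal place, not exactly.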