Miyakogusa Predicted Gene
- Lj0g3v0350719.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj0g3v0350719.1 Non Characterized Hit- tr|K4DBF7|K4DBF7_SOLLC
Uncharacterized protein OS=Solanum lycopersicum
GN=Sol,23.64,4e-16,FAMILY NOT NAMED,NULL; seg,NULL,CUFF.24097.1
(396 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value
AT3G56750.1 | Symbols: | unknown protein; BEST Arabidopsis thal...   508   e-144
AT2G41150.2 | Symbols: | unknown protein; FUNCTIONS IN: molecul...   502   e-142
AT2G41150.1 | Symbols: | unknown protein; FUNCTIONS IN: molecul...   271   4e-73
AT4G12700.1 | Symbols: | unknown protein; BEST Arabidopsis thal...    84   2e-16
AT4G08810.1 | Symbols: SUB1 | calcium ion binding | chr4:5616204...   81   1e-15
AT2G04280.1 | Symbols: | unknown protein; FUNCTIONS IN: molecul...    80   2e-15
>AT3G56750.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT2G41150.2); Has 128 Blast hits to 128 proteins
in 16 species: Archae - 0; Bacteria - 0; Metazoa - 0;
Fungi - 0; Plants - 117; Viruses - 0; Other Eukaryotes -
11 (source: NCBI BLink). | chr3:21018326-21020192
REVERSE LENGTH=403
Length = 403
Score = 508 bits (1309), Expect = e-144, Method: Compositional matrix adjust.
Identities = 246/337 (72%), Positives = 287/337 (85%), Gaps = 9/337 (2%)
Query: 61 SEKYLYWGNRIDCPGKHCGSCEGLGHQESSLRCALEEAIYLRRTFVMPSRMCINPIHNKK 120
SEKYLYWGNRIDCPGK+C +C GLGHQESSLRCALEEA++L RTFVMPS MCINPIHNKK
Sbjct: 67 SEKYLYWGNRIDCPGKNCETCAGLGHQESSLRCALEEAMFLNRTFVMPSGMCINPIHNKK 126
Query: 121 GILHRLANNNATEEEQWATSFCAMDSLYDLKLISQTVPVILENSREWHMLLST---LEDT 177
GIL+R ++N T EE W S CAMDSLYD+ LIS+ +PVIL++S+ WH++LST L +
Sbjct: 127 GILNR--SDNKTTEEGWLGSSCAMDSLYDIDLISEKIPVILDDSKTWHIVLSTSMKLGER 184
Query: 178 QIAHVQRLTRIHLKQDARYSHLLLINRTASPLSWFMECKDRNNRSAIMLPYSFLPSMAAH 237
IAHV +TR LK+ + YS+LL+INRTASPL+WF+ECKDR+NRSA+MLPYSFLP+MAA
Sbjct: 185 GIAHVSGVTRHRLKE-SHYSNLLIINRTASPLAWFVECKDRSNRSAVMLPYSFLPNMAAA 243
Query: 238 KLRDAAQKIKALLGDYDAIHVRRGDKIKTRKDSLGVARTLHPHLDRDTRPEFILCRIAKW 297
KLR+AA+KIKA LGDYDAIHVRRGDK+KTRKD GV R PHLDRDTRPEFIL RI K
Sbjct: 244 KLRNAAEKIKAQLGDYDAIHVRRGDKLKTRKDRFGVERIQFPHLDRDTRPEFILRRIEKR 303
Query: 298 VPPGRTLFIASNERTPGFFSPLSVRYRLAYSSNYSHMLDPIVENNYQLFMIERLIMMGAK 357
+P GRTLFI SNER PGFFSPL+VRY+LAYSSN+S +LDPI+ENNYQLFM+ERL+MMGAK
Sbjct: 304 IPRGRTLFIGSNERKPGFFSPLAVRYKLAYSSNFSEILDPIIENNYQLFMMERLVMMGAK 363
Query: 358 TFIRTFKEDETDLSLTDDPKKNTKKWQIPELVYNADE 394
T+ +TFKE ETDL+LTDDPKKN K W+IP VY DE
Sbjct: 364 TYFKTFKEYETDLTLTDDPKKN-KNWEIP--VYTMDE 397
>AT2G41150.2 | Symbols: | unknown protein; FUNCTIONS IN:
molecular_function unknown; INVOLVED IN:
biological_process unknown; LOCATED IN: endomembrane
system; EXPRESSED IN: leaf; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT3G56750.1);
Has 127 Blast hits to 127 proteins in 16 species: Archae
- 0; Bacteria - 0; Metazoa - 0; Fungi - 0; Plants - 117;
Viruses - 0; Other Eukaryotes - 10 (source: NCBI BLink).
| chr2:17153851-17155633 FORWARD LENGTH=404
Length = 404
Score = 502 bits (1292), Expect = e-142, Method: Compositional matrix adjust.
Identities = 238/337 (70%), Positives = 281/337 (83%), Gaps = 8/337 (2%)
Query: 61 SEKYLYWGNRIDCPGKHCGSCEGLGHQESSLRCALEEAIYLRRTFVMPSRMCINPIHNKK 120
S+KYLYWGNRIDCPGK+C +C GLGHQESSLRCALEEA++L RTFVMPSRMCINPIHNKK
Sbjct: 67 SDKYLYWGNRIDCPGKNCETCAGLGHQESSLRCALEEAMFLNRTFVMPSRMCINPIHNKK 126
Query: 121 GILHRLANNNATEEEQWATSFCAMDSLYDLKLISQTVPVILENSREWHMLLST---LEDT 177
GIL+R +NN T EE W S CAM+SLYD+ LIS+ +PVIL++S WH++LST L++
Sbjct: 127 GILNR--SNNETREESWEVSSCAMESLYDIDLISEKIPVILDDSETWHIMLSTSMKLKER 184
Query: 178 QIAHVQRLTRIHLKQDARYSHLLLINRTASPLSWFMECKDRNNRSAIMLPYSFLPSMAAH 237
AHV R L + +++LLLINRTASPL+WF+ECKDR NRS +MLPYSFL +MAA
Sbjct: 185 GSAHVYGANRHELNDSSDFTNLLLINRTASPLAWFVECKDRGNRSDVMLPYSFLQTMAAS 244
Query: 238 KLRDAAQKIKALLGDYDAIHVRRGDKIKTRKDSLGVARTLHPHLDRDTRPEFILCRIAKW 297
+LRDAA+KIKA LGDYDAIHVRRGDK+KTRKD V R+ PHLDRDTRPEFI+ RI K
Sbjct: 245 RLRDAAEKIKAKLGDYDAIHVRRGDKLKTRKDRFRVERSQFPHLDRDTRPEFIIGRIQKQ 304
Query: 298 VPPGRTLFIASNERTPGFFSPLSVRYRLAYSSNYSHMLDPIVENNYQLFMIERLIMMGAK 357
+PPGRTLFI SNERTP FFSPL++RY++AYSSN+S +LDPI+ENNYQLFM+ERLIMMGAK
Sbjct: 305 IPPGRTLFIGSNERTPDFFSPLAIRYKVAYSSNFSEILDPIIENNYQLFMVERLIMMGAK 364
Query: 358 TFIRTFKEDETDLSLTDDPKKNTKKWQIPELVYNADE 394
TF +TF+E ETDL+LTDDPKKN K W+IP VY DE
Sbjct: 365 TFFKTFREYETDLTLTDDPKKN-KNWEIP--VYTMDE 398
>AT2G41150.1 | Symbols: | unknown protein; FUNCTIONS IN:
molecular_function unknown; INVOLVED IN:
biological_process unknown; LOCATED IN: endomembrane
system; EXPRESSED IN: leaf; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT3G56750.1);
Has 57 Blast hits to 57 proteins in 12 species: Archae -
0; Bacteria - 0; Metazoa - 0; Fungi - 0; Plants - 56;
Viruses - 0; Other Eukaryotes - 1 (source: NCBI BLink).
| chr2:17153851-17155019 FORWARD LENGTH=259
Length = 259
Score = 271 bits (694), Expect = 4e-73, Method: Compositional matrix adjust.
Identities = 128/192 (66%), Positives = 156/192 (81%), Gaps = 5/192 (2%)
Query: 61 SEKYLYWGNRIDCPGKHCGSCEGLGHQESSLRCALEEAIYLRRTFVMPSRMCINPIHNKK 120
S+KYLYWGNRIDCPGK+C +C GLGHQESSLRCALEEA++L RTFVMPSRMCINPIHNKK
Sbjct: 67 SDKYLYWGNRIDCPGKNCETCAGLGHQESSLRCALEEAMFLNRTFVMPSRMCINPIHNKK 126
Query: 121 GILHRLANNNATEEEQWATSFCAMDSLYDLKLISQTVPVILENSREWHMLLST---LEDT 177
GIL+R +NN T EE W S CAM+SLYD+ LIS+ +PVIL++S WH++LST L++
Sbjct: 127 GILNR--SNNETREESWEVSSCAMESLYDIDLISEKIPVILDDSETWHIMLSTSMKLKER 184
Query: 178 QIAHVQRLTRIHLKQDARYSHLLLINRTASPLSWFMECKDRNNRSAIMLPYSFLPSMAAH 237
AHV R L + +++LLLINRTASPL+WF+ECKDR NRS +MLPYSFL +MAA
Sbjct: 185 GSAHVYGANRHELNDSSDFTNLLLINRTASPLAWFVECKDRGNRSDVMLPYSFLQTMAAS 244
Query: 238 KLRDAAQKIKAL 249
+LRDAA+K+K L
Sbjct: 245 RLRDAAEKVKEL 256
>AT4G12700.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT2G04280.1); Has 136 Blast hits to 136 proteins
in 17 species: Archae - 0; Bacteria - 0; Metazoa - 0;
Fungi - 0; Plants - 125; Viruses - 0; Other Eukaryotes -
11 (source: NCBI BLink). | chr4:7482643-7484328 REVERSE
LENGTH=561
Length = 561
Score = 83.6 bits (205), Expect = 2e-16, Method: Compositional matrix adjust.
Identities = 72/312 (23%), Positives = 131/312 (41%), Gaps = 60/312 (19%)
Query: 81 CEGLGHQESSLRCALEEAIYLRRTFVMPSRMCINPIHNKKGILHRLANNNATEEEQWATS 140
C+ + H S CAL EA YL RT VM +C++ ++ G EE
Sbjct: 266 CKSMNHFLWSFLCALGEAQYLNRTLVMDLTLCLSSVYTLSG----------QNEEGKDFR 315
Query: 141 FCAMDSLYDLKLISQTVPVILENSREWHMLLSTLEDTQIAHVQRLTRIHLKQDARYSHLL 200
F +D + + + +L+ + W D + + ++HL +D R + +
Sbjct: 316 F-----YFDFEHLKEAAS-MLDQVQFW-------ADWGKWYKKNGLKLHLVEDFRVTPMK 362
Query: 201 LIN----------RTASPLSWFMECKDRNNRSAIMLPYSFLPSMAAHKLRDAAQKIKALL 250
L++ T P +++ + S + P++ L + +L + I + L
Sbjct: 363 LVDVKDTLIMRKFGTVEPDNYWYRVCEGETESVVQRPWNLL--WKSKRLMEIVSAIASRL 420
Query: 251 G-DYDAIHVRRGDKIKTRKDSLGVARTLHPHLDRDTRPEFILCRIAKWVPPGRTLFIASN 309
DYDAIH+ RGDK + ++ + P+L++DT P IL + + GR L+IA+N
Sbjct: 421 NWDYDAIHIERGDKARNKE--------VWPNLEKDTSPSSILSTLQDKIEQGRNLYIATN 472
Query: 310 ERTPGFFSPLSVRYRLAYSSNYSHMLD----------------PIVENNYQLFMIERLIM 353
E FF+PL +Y+ + + + D P+ + Y ++ +
Sbjct: 473 EPELSFFNPLKDKYKPHFLDEFKDLWDESSEWYSETTKLNGGNPVEFDGYMRASVDTEVF 532
Query: 354 MGAKTFIRTFKE 365
+ K I TF +
Sbjct: 533 LRGKKQIETFND 544
>AT4G08810.1 | Symbols: SUB1 | calcium ion binding |
chr4:5616204-5617862 REVERSE LENGTH=552
Length = 552
Score = 80.9 bits (198), Expect = 1e-15, Method: Compositional matrix adjust.
Identities = 81/351 (23%), Positives = 146/351 (41%), Gaps = 63/351 (17%)
Query: 49 GSHSEVNNS--------NGESEKYLYWGNRIDCPGKHCGSCEGLGHQESSLRCALEEAIY 100
G SE+N++ + KYLY+ D C+G+ S C L EA+Y
Sbjct: 227 GGDSEINDTIPTLGSQTSFRRGKYLYYSRGGD-------YCKGMNQYMWSFLCGLGEAMY 279
Query: 101 LRRTFVMPSRMCINPIHNKKGILHRLANNNATEEEQWATSFCAMDSLYDLKLISQTVPVI 160
L RTFVM +C++ ++ KG EE + +D + + +T ++
Sbjct: 280 LNRTFVMDLSLCLSSSYSSKG---------KDEEGK------DFRYYFDFEHLKETASIV 324
Query: 161 -----LENSREWHMLLSTLEDTQIAHVQRLTRIHLKQDARYSHLLLINRTASPLSWFMEC 215
L + ++W+ L + R++ + L +D + W+ C
Sbjct: 325 EEGEFLRDWKKWNRLHKRKVPVRKVKTHRVSPLQLSKDKSTIIWRQFDTPEPENYWYRVC 384
Query: 216 KDRNNRSAIMLPYSFLPSMAAHKLRDAAQKIKALLG-DYDAIHVRRGDKIKTRKDSLGVA 274
+ + ++ + P+ L + +L + +I + D+DA+HV RG+K K +K
Sbjct: 385 EGQASK-YVERPWHAL--WKSKRLMNIVSEISGKMDWDFDAVHVVRGEKAKNKK------ 435
Query: 275 RTLHPHLDRDTRPEFILCRIAKWVPPGRTLFIASNERTPGFFSPLSVRYRLAYSSNYSHM 334
L PHLD DT P+ IL ++ V R L++A+NE +F L +Y++ +YS++
Sbjct: 436 --LWPHLDADTWPDAILTKLKGLVQVWRNLYVATNEPFYNYFDKLRSQYKVHLLDDYSYL 493
Query: 335 L----------------DPIVENNYQLFMIERLIMMGAKTFIRTFKEDETD 369
P+ + Y ++ + KT + TF TD
Sbjct: 494 WGNKSEWYNETSLLNNGKPVEFDGYMRVAVDTEVFYRGKTRVETFYNLTTD 544
>AT2G04280.1 | Symbols: | unknown protein; FUNCTIONS IN:
molecular_function unknown; INVOLVED IN:
biological_process unknown; EXPRESSED IN: 24 plant
structures; EXPRESSED DURING: 13 growth stages; BEST
Arabidopsis thaliana protein match is: unknown protein
(TAIR:AT4G12700.1); Has 130 Blast hits to 130 proteins
in 16 species: Archae - 0; Bacteria - 0; Metazoa - 0;
Fungi - 0; Plants - 124; Viruses - 0; Other Eukaryotes -
6 (source: NCBI BLink). | chr2:1480277-1481983 REVERSE
LENGTH=568
Length = 568
Score = 80.5 bits (197), Expect = 2e-15, Method: Compositional matrix adjust.
Identities = 76/318 (23%), Positives = 132/318 (41%), Gaps = 52/318 (16%)
Query: 81 CEGLGHQESSLRCALEEAIYLRRTFVMPSRMCINPIHNKKGILHRLANNNATEEEQWATS 140
C+ + H S CAL EA YL RT VM +C++ I+ G EE
Sbjct: 271 CKSMNHFLWSFLCALGEAQYLNRTLVMDLTLCLSSIYTSSG----------QNEEGKDFR 320
Query: 141 FCAMDSLYDLKLISQTVPVILENS--REWHMLLSTLEDTQIAHVQRLTRIHLKQDARYSH 198
F +D + + + V+ E +W L + H+ R+ + A
Sbjct: 321 F-----YFDFEHLKEAASVLDEAQFWAQWGKLRKKRRNRLNLHLVEDFRVTPMKLAAVKD 375
Query: 199 LLLINRTAS--PLSWFMECKDRNNRSAIMLPYSFLPSMAAHKLRDAAQKIKALLG-DYDA 255
L++ + S P +++ + + S + P+ L + +L + I + L DYDA
Sbjct: 376 TLIMRKFGSVEPDNYWYRVCEGDAESVVKRPWHLL--WKSRRLMEIVSAIASRLNWDYDA 433
Query: 256 IHVRRGDKIKTRKDSLGVARTLHPHLDRDTRPEFILCRIAKWVPPGRTLFIASNERTPGF 315
+H+ RG+K + ++ + P+L+ DT P +L + V GR L+IA+NE F
Sbjct: 434 VHIERGEKARNKE--------VWPNLEADTSPSALLSTLQDKVEEGRHLYIATNEGELSF 485
Query: 316 FSPLSVRYRLAYSSNYSHMLD----------------PIVENNYQLFMIERLIMMGAKTF 359
F+PL +Y + +Y + D P+ + Y ++ + + K
Sbjct: 486 FNPLKDKYATHFLYDYKDLWDESSEWYSETTKLNGGNPVEFDGYMRASVDTEVFLRGKKQ 545
Query: 360 IRTFKEDETDLSLTDDPK 377
I TF + LT+D K
Sbjct: 546 IETFND------LTNDCK 557