Miyakogusa Predicted Gene
- Lj2g3v1172540.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj2g3v1172540.1 Non Characterized Hit- tr|I1J6R6|I1J6R6_SOYBN
Uncharacterized protein OS=Glycine max GN=Gma.31297
PE,81.59,0,coiled-coil,NULL; SUBFAMILY NOT NAMED,NULL; FAMILY NOT
NAMED,NULL; seg,NULL,CUFF.36381.1
(524 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                     Score   E
Sequences producing significant alignments:                         (bits)   Value
AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...    469   e-132
AT3G25690.1 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...    469   e-132
AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...    469   e-132
AT4G18570.1 | Symbols: | Tetratricopeptide repeat (TPR)-like su...     312   4e-85
AT1G48280.1 | Symbols: | hydroxyproline-rich glycoprotein famil...     272   4e-73
AT1G07120.1 | Symbols: | FUNCTIONS IN: molecular_function unkno...     241   1e-63
AT4G04980.1 | Symbols: | unknown protein; BEST Arabidopsis thal...      61   2e-09
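Note on reading the summary above: each row ends with a bit score and an E-value, and this legacy format prints very small E-values without a leading digit (e-132 stands for 1e-132). The following is a minimal Python sketch of how such a row might be parsed. The names HIT_RE and parse_hit are hypothetical helpers introduced here for illustration, not part of BLAST or of any library.

```python
import re

# Hypothetical pattern: sequence ID first, then free text, then the
# bit score and the E-value as the last two whitespace-separated fields.
HIT_RE = re.compile(
    r"^(?P<id>\S+)\s.*?\s(?P<bits>\d+(?:\.\d+)?)\s+(?P<evalue>\S+)$"
)

def parse_hit(line):
    """Return (sequence_id, bit_score, e_value) for one summary row."""
    match = HIT_RE.match(line.strip())
    if match is None:
        raise ValueError("not a hit summary row: %r" % line)
    evalue = match.group("evalue")
    # Legacy BLAST drops the mantissa for tiny values: 'e-132' means 1e-132.
    if evalue.startswith("e"):
        evalue = "1" + evalue
    return match.group("id"), float(match.group("bits")), float(evalue)

row = "AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ... 469 e-132"
print(parse_hit(row))
```

The sketch assumes the description never ends in two bare numeric-looking tokens of its own; a dedicated BLAST parser would be a safer choice for production use.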
>AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
family protein | chr3:9354061-9357757 FORWARD
LENGTH=1004
Length = 1004
Score = 469 bits (1207), Expect = e-132, Method: Compositional matrix adjust.
Identities = 224/269 (83%), Positives = 252/269 (93%), Gaps = 1/269 (0%)
Query: 237 DKVHRAPELVEFYQSLMKREAKKDTSTLLISS-TSNASDARSNMIGEIENRSTFLLAVKA 295
+KVHRAPELVEFYQSLMKRE+KK+ + LISS T N+S AR+NMIGEIENRSTFLLAVKA
Sbjct: 716 NKVHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKA 775
Query: 296 DVETQGDFVTSLATEVRAATFSNIDDLVAFVNWLDEELSFLVDERAVLKHFDWPEGKADA 355
DVETQGDFV SLATEVRA++F++I+DL+AFV+WLDEELSFLVDERAVLKHFDWPEGKADA
Sbjct: 776 DVETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADA 835
Query: 356 LREAAFEYQDLMKLEKRVSTFADDPKLSCDAALKKMYSLLEKVEQSVYALLRTRDLAISR 415
LREAAFEYQDLMKLEK+V++F DDP LSC+ ALKKMY LLEKVEQSVYALLRTRD+AISR
Sbjct: 836 LREAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISR 895
Query: 416 YKEFGIPVNWLLDSGVVGKIKLSSVQLAKKYMKRVTSELDALSGSEKEPAREFLTLQGVR 475
YKEFGIPV+WL D+GVVGKIKLSSVQLAKKYMKRV ELD++SGS+K+P REFL LQGVR
Sbjct: 896 YKEFGIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVR 955
Query: 476 FAFRVHQFAGGFDADSMKAFEDLRNRIQT 504
FAFRVHQFAGGFDA+SMKAFE+LR+R +T
Sbjct: 956 FAFRVHQFAGGFDAESMKAFEELRSRAKT 984
Score = 149 bits (375), Expect = 6e-36, Method: Compositional matrix adjust.
Identities = 84/177 (47%), Positives = 107/177 (60%), Gaps = 28/177 (15%)
Query: 6 KPRGPLESLMIRNAGDGVAITTFGQRDQETTDSPEIPTTPNLRRAPXXXX----XXXXXX 61
K RGPLESLMIRNAG+ VAITTFGQ DQE+ +PE P P +R
Sbjct: 482 KQRGPLESLMIRNAGESVAITTFGQVDQESPGTPETPNLPRIRTQQQASSPGEGLNSVAA 541
Query: 62 XFHLMSKSIDGSLDEKYPAYKDRHKLALSREKQLKEKAEKARVQKFGDSSNLSISKAERD 121
FH+MSKS+D LDEKYPAYKDRHKLA+ REK +K KA++AR ++FG +
Sbjct: 542 SFHVMSKSVDNVLDEKYPAYKDRHKLAVEREKHIKHKADQARAERFGGN----------- 590
Query: 122 RPISLPPKLTLIKEK----PLVSASANDQ------SDDGK-NVDTQNISKMKLADIE 167
++LPPKL +KEK P V + DQ S++GK + + ++KMKL DIE
Sbjct: 591 --VALPPKLAQLKEKRVVVPSVITATGDQSNESNESNEGKASENAATVTKMKLVDIE 645
>AT3G25690.1 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
family protein | chr3:9354061-9357757 FORWARD
LENGTH=1004
Length = 1004
Score = 469 bits (1207), Expect = e-132, Method: Compositional matrix adjust.
Identities = 224/269 (83%), Positives = 252/269 (93%), Gaps = 1/269 (0%)
Query: 237 DKVHRAPELVEFYQSLMKREAKKDTSTLLISS-TSNASDARSNMIGEIENRSTFLLAVKA 295
+KVHRAPELVEFYQSLMKRE+KK+ + LISS T N+S AR+NMIGEIENRSTFLLAVKA
Sbjct: 716 NKVHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKA 775
Query: 296 DVETQGDFVTSLATEVRAATFSNIDDLVAFVNWLDEELSFLVDERAVLKHFDWPEGKADA 355
DVETQGDFV SLATEVRA++F++I+DL+AFV+WLDEELSFLVDERAVLKHFDWPEGKADA
Sbjct: 776 DVETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADA 835
Query: 356 LREAAFEYQDLMKLEKRVSTFADDPKLSCDAALKKMYSLLEKVEQSVYALLRTRDLAISR 415
LREAAFEYQDLMKLEK+V++F DDP LSC+ ALKKMY LLEKVEQSVYALLRTRD+AISR
Sbjct: 836 LREAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISR 895
Query: 416 YKEFGIPVNWLLDSGVVGKIKLSSVQLAKKYMKRVTSELDALSGSEKEPAREFLTLQGVR 475
YKEFGIPV+WL D+GVVGKIKLSSVQLAKKYMKRV ELD++SGS+K+P REFL LQGVR
Sbjct: 896 YKEFGIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVR 955
Query: 476 FAFRVHQFAGGFDADSMKAFEDLRNRIQT 504
FAFRVHQFAGGFDA+SMKAFE+LR+R +T
Sbjct: 956 FAFRVHQFAGGFDAESMKAFEELRSRAKT 984
Score = 149 bits (375), Expect = 6e-36, Method: Compositional matrix adjust.
Identities = 84/177 (47%), Positives = 107/177 (60%), Gaps = 28/177 (15%)
Query: 6 KPRGPLESLMIRNAGDGVAITTFGQRDQETTDSPEIPTTPNLRRAPXXXX----XXXXXX 61
K RGPLESLMIRNAG+ VAITTFGQ DQE+ +PE P P +R
Sbjct: 482 KQRGPLESLMIRNAGESVAITTFGQVDQESPGTPETPNLPRIRTQQQASSPGEGLNSVAA 541
Query: 62 XFHLMSKSIDGSLDEKYPAYKDRHKLALSREKQLKEKAEKARVQKFGDSSNLSISKAERD 121
FH+MSKS+D LDEKYPAYKDRHKLA+ REK +K KA++AR ++FG +
Sbjct: 542 SFHVMSKSVDNVLDEKYPAYKDRHKLAVEREKHIKHKADQARAERFGGN----------- 590
Query: 122 RPISLPPKLTLIKEK----PLVSASANDQ------SDDGK-NVDTQNISKMKLADIE 167
++LPPKL +KEK P V + DQ S++GK + + ++KMKL DIE
Sbjct: 591 --VALPPKLAQLKEKRVVVPSVITATGDQSNESNESNEGKASENAATVTKMKLVDIE 645
>AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
family protein | chr3:9354061-9357757 FORWARD LENGTH=863
Length = 863
Score = 469 bits (1207), Expect = e-132, Method: Compositional matrix adjust.
Identities = 224/269 (83%), Positives = 252/269 (93%), Gaps = 1/269 (0%)
Query: 237 DKVHRAPELVEFYQSLMKREAKKDTSTLLISS-TSNASDARSNMIGEIENRSTFLLAVKA 295
+KVHRAPELVEFYQSLMKRE+KK+ + LISS T N+S AR+NMIGEIENRSTFLLAVKA
Sbjct: 575 NKVHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKA 634
Query: 296 DVETQGDFVTSLATEVRAATFSNIDDLVAFVNWLDEELSFLVDERAVLKHFDWPEGKADA 355
DVETQGDFV SLATEVRA++F++I+DL+AFV+WLDEELSFLVDERAVLKHFDWPEGKADA
Sbjct: 635 DVETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADA 694
Query: 356 LREAAFEYQDLMKLEKRVSTFADDPKLSCDAALKKMYSLLEKVEQSVYALLRTRDLAISR 415
LREAAFEYQDLMKLEK+V++F DDP LSC+ ALKKMY LLEKVEQSVYALLRTRD+AISR
Sbjct: 695 LREAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISR 754
Query: 416 YKEFGIPVNWLLDSGVVGKIKLSSVQLAKKYMKRVTSELDALSGSEKEPAREFLTLQGVR 475
YKEFGIPV+WL D+GVVGKIKLSSVQLAKKYMKRV ELD++SGS+K+P REFL LQGVR
Sbjct: 755 YKEFGIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVR 814
Query: 476 FAFRVHQFAGGFDADSMKAFEDLRNRIQT 504
FAFRVHQFAGGFDA+SMKAFE+LR+R +T
Sbjct: 815 FAFRVHQFAGGFDAESMKAFEELRSRAKT 843
Score = 149 bits (375), Expect = 6e-36, Method: Compositional matrix adjust.
Identities = 84/177 (47%), Positives = 107/177 (60%), Gaps = 28/177 (15%)
Query: 6 KPRGPLESLMIRNAGDGVAITTFGQRDQETTDSPEIPTTPNLRRAPXXXX----XXXXXX 61
K RGPLESLMIRNAG+ VAITTFGQ DQE+ +PE P P +R
Sbjct: 341 KQRGPLESLMIRNAGESVAITTFGQVDQESPGTPETPNLPRIRTQQQASSPGEGLNSVAA 400
Query: 62 XFHLMSKSIDGSLDEKYPAYKDRHKLALSREKQLKEKAEKARVQKFGDSSNLSISKAERD 121
FH+MSKS+D LDEKYPAYKDRHKLA+ REK +K KA++AR ++FG +
Sbjct: 401 SFHVMSKSVDNVLDEKYPAYKDRHKLAVEREKHIKHKADQARAERFGGN----------- 449
Query: 122 RPISLPPKLTLIKEK----PLVSASANDQ------SDDGK-NVDTQNISKMKLADIE 167
++LPPKL +KEK P V + DQ S++GK + + ++KMKL DIE
Sbjct: 450 --VALPPKLAQLKEKRVVVPSVITATGDQSNESNESNEGKASENAATVTKMKLVDIE 504
>AT4G18570.1 | Symbols: | Tetratricopeptide repeat (TPR)-like
superfamily protein | chr4:10231439-10234534 FORWARD
LENGTH=642
Length = 642
Score = 312 bits (799), Expect = 4e-85, Method: Compositional matrix adjust.
Identities = 159/280 (56%), Positives = 208/280 (74%), Gaps = 15/280 (5%)
Query: 234 MAGDKVHRAPELVEFYQSLMKREA---KKDTS------TLLISSTSNASDARSNMIGEIE 284
+A KV R PE+VEFY SLM+R++ ++D++ I + SNA D MIGEIE
Sbjct: 348 IASAKVRRVPEVVEFYHSLMRRDSTNSRRDSTGGGNAAAEAILANSNARD----MIGEIE 403
Query: 285 NRSTFLLAVKADVETQGDFVTSLATEVRAATFSNIDDLVAFVNWLDEELSFLVDERAVLK 344
NRS +LLA+K DVETQGDF+ L EV A FS+I+D+V FV WLD+ELS+LVDERAVLK
Sbjct: 404 NRSVYLLAIKTDVETQGDFIRFLIKEVGNAAFSDIEDVVPFVKWLDDELSYLVDERAVLK 463
Query: 345 HFDWPEGKADALREAAFEYQDLMKLEKRVSTFADDPKLSCDAALKKMYSLLEKVEQSVYA 404
HF+WPE KADALREAAF Y DL KL S F +DP+ S +ALKKM +L EK+E VY+
Sbjct: 464 HFEWPEQKADALREAAFCYFDLKKLISEASRFREDPRQSSSSALKKMQALFEKLEHGVYS 523
Query: 405 LLRTRDLAISRYKEFGIPVNWLLDSGVVGKIKLSSVQLAKKYMKRVTSELDALSGSEKEP 464
L R R+ A +++K F IPV+W+L++G+ +IKL+SV+LA KYMKRV++EL+A+ G E
Sbjct: 524 LSRMRESAATKFKSFQIPVDWMLETGITSQIKLASVKLAMKYMKRVSAELEAIEGGGPEE 583
Query: 465 AREFLTLQGVRFAFRVHQFAGGFDADSMKAFEDLRNRIQT 504
L +QGVRFAFRVHQFAGGFDA++MKAFE+LR++ ++
Sbjct: 584 EE--LIVQGVRFAFRVHQFAGGFDAETMKAFEELRDKARS 621
>AT1G48280.1 | Symbols: | hydroxyproline-rich glycoprotein family
protein | chr1:17835196-17837553 FORWARD LENGTH=558
Length = 558
Score = 272 bits (696), Expect = 4e-73, Method: Compositional matrix adjust.
Identities = 122/265 (46%), Positives = 190/265 (71%)
Query: 238 KVHRAPELVEFYQSLMKREAKKDTSTLLISSTSNASDARSNMIGEIENRSTFLLAVKADV 297
+ ++P + + +Q L K++ ++ S + + S + A ++++GEI+NRS L+A+KAD+
Sbjct: 278 RAQKSPPVSQLFQLLNKQDNSRNLSQSVNGNKSQVNSAHNSIVGEIQNRSAHLIAIKADI 337
Query: 298 ETQGDFVTSLATEVRAATFSNIDDLVAFVNWLDEELSFLVDERAVLKHFDWPEGKADALR 357
ET+G+F+ L +V FS+++D++ FV+WLD+EL+ L DERAVLKHF WPE KAD L+
Sbjct: 338 ETKGEFINDLIQKVLTTCFSDMEDVMKFVDWLDKELATLADERAVLKHFKWPEKKADTLQ 397
Query: 358 EAAFEYQDLMKLEKRVSTFADDPKLSCDAALKKMYSLLEKVEQSVYALLRTRDLAISRYK 417
EAA EY++L KLEK +S+++DDP + ALKKM +LL+K EQ + L+R R ++ Y+
Sbjct: 398 EAAVEYRELKKLEKELSSYSDDPNIHYGVALKKMANLLDKSEQRIRRLVRLRGSSMRSYQ 457
Query: 418 EFGIPVNWLLDSGVVGKIKLSSVQLAKKYMKRVTSELDALSGSEKEPAREFLTLQGVRFA 477
+F IPV W+LDSG++ KIK +S++LAK YM RV +EL + ++E +E L LQGVRFA
Sbjct: 458 DFKIPVEWMLDSGMICKIKRASIKLAKTYMNRVANELQSARNLDRESTKEALLLQGVRFA 517
Query: 478 FRVHQFAGGFDADSMKAFEDLRNRI 502
+R HQFAGG D +++ A E+++ R+
Sbjct: 518 YRTHQFAGGLDPETLCALEEIKQRV 542
>AT1G07120.1 | Symbols: | FUNCTIONS IN: molecular_function unknown;
INVOLVED IN: biological_process unknown; LOCATED IN:
chloroplast envelope; EXPRESSED IN: inflorescence
meristem, petal, leaf whorl, flower; EXPRESSED DURING: 4
anthesis, petal differentiation and expansion stage;
BEST Arabidopsis thaliana protein match is:
Tetratricopeptide repeat (TPR)-like superfamily protein
(TAIR:AT4G18570.1); Has 288 Blast hits to 260 proteins
in 50 species: Archae - 0; Bacteria - 8; Metazoa - 27;
Fungi - 15; Plants - 163; Viruses - 0; Other Eukaryotes
- 75 (source: NCBI BLink). | chr1:2184874-2186580
REVERSE LENGTH=392
Length = 392
Score = 241 bits (615), Expect = 1e-63, Method: Compositional matrix adjust.
Identities = 126/263 (47%), Positives = 178/263 (67%), Gaps = 9/263 (3%)
Query: 239 VHRAPELVEFYQSLMKREAKKDTSTLLISSTSNASDA-RSNMIGEIENRSTFLLAVKADV 297
V RAPE+VEFY++L KRE+ I+ S A NMIGEIENRS +L +K+D
Sbjct: 128 VRRAPEVVEFYRALTKRESHMGNK---INQNGVLSPAFNRNMIGEIENRSKYLSDIKSDT 184
Query: 298 ETQGDFVTSLATEVRAATFSNIDDLVAFVNWLDEELSFLVDERAVLKHF-DWPEGKADAL 356
+ D + L ++V AATF++I ++ FV W+DEELS LVDERAVLKHF WPE K D+L
Sbjct: 185 DRHRDHIHILISKVEAATFTDISEVETFVKWIDEELSSLVDERAVLKHFPKWPERKVDSL 244
Query: 357 REAAFEYQDLMKLEKRVSTFADDPKLSCDAALKKMYSLLEKVEQSVYALLRTRDLAISRY 416
REAA Y+ L + +F D+PK S AL+++ SL +++E+SV + RD RY
Sbjct: 245 REAACNYKRPKNLGNEILSFKDNPKDSLTQALQRIQSLQDRLEESVNNTEKMRDSTGKRY 304
Query: 417 KEFGIPVNWLLDSGVVGKIKLSSVQLAKKYMKRVTSELDALSGSEKEPAREFLTLQGVRF 476
K+F IP W+LD+G++G++K SS++LA++YMKR+ EL++ +GS KE L LQGVRF
Sbjct: 305 KDFQIPWEWMLDTGLIGQLKYSSLRLAQEYMKRIAKELES-NGSGKEGN---LMLQGVRF 360
Query: 477 AFRVHQFAGGFDADSMKAFEDLR 499
A+ +HQFAGGFD +++ F +L+
Sbjct: 361 AYTIHQFAGGFDGETLSIFHELK 383
>AT4G04980.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT1G11070.1); Has 23100 Blast hits to 15699
proteins in 1063 species: Archae - 116; Bacteria - 2262;
Metazoa - 8308; Fungi - 3268; Plants - 3181; Viruses -
958; Other Eukaryotes - 5007 (source: NCBI BLink). |
chr4:2544210-2547893 REVERSE LENGTH=880
Length = 880
Score = 60.8 bits (146), Expect = 2e-09, Method: Compositional matrix adjust.
Identities = 59/241 (24%), Positives = 112/241 (46%), Gaps = 17/241 (7%)
Query: 275 ARSNM---IGEIENRSTFLLAVKADVETQGDFVTSLATEVRAATFSNIDDLVAFVNWLDE 331
ARS M + E+ RS++ ++ DV+ + L + + + ++ +L+ F + ++
Sbjct: 635 ARSGMADALAEMTKRSSYFQQIEEDVQKYAKSIEELKSSIHSFQTKDMKELLEFHSKVES 694
Query: 332 ELSFLVDERAVLKHFD-WPEGKADALREAAFEYQDLMKLEKRVSTFADDPKLSCDAALKK 390
L L DE VL F+ +PE K + +R A Y+ L + + + +P L + L K
Sbjct: 695 ILEKLTDETQVLARFEGFPEKKLEVIRTAGALYKKLDGILVELKNWKIEPPL--NDLLDK 752
Query: 391 MYSLLEKVEQSVYALLRTRDLAISRYKEFGIPVNWLLDSGVVGKIKLSSVQLAKKYMK-- 448
+ K + + + RT+D +K + I + D V+ ++K + V ++ M+
Sbjct: 753 IERYFNKFKGEIETVERTKDEDAKMFKRYNINI----DFEVLVQVKETMVDVSSNCMELA 808
Query: 449 ---RVTSELDALSGSEKEPAREFLT--LQGVRFAFRVHQFAGGFDADSMKAFEDLRNRIQ 503
R + +A +G E + E + +FAF+V+ FAGG D + L + IQ
Sbjct: 809 LKERREANEEAKNGEESKMKEERAKRLWRAFQFAFKVYTFAGGHDERADCLTRQLAHEIQ 868
Query: 504 T 504
T
Sbjct: 869 T 869