Miyakogusa Predicted Gene

Lj2g3v1172540.1
BLASTP 2.2.25 [Feb-01-2011]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F., 
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Lj2g3v1172540.1 Non Characterized Hit- tr|I1J6R6|I1J6R6_SOYBN
Uncharacterized protein OS=Glycine max GN=Gma.31297
PE,81.59,0,coiled-coil,NULL; SUBFAMILY NOT NAMED,NULL; FAMILY NOT
NAMED,NULL; seg,NULL,CUFF.36381.1
         (524 letters)

Database: TAIR10_pep 
           35,386 sequences; 14,482,855 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...   469   e-132
AT3G25690.1 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...   469   e-132
AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...   469   e-132
AT4G18570.1 | Symbols:  | Tetratricopeptide repeat (TPR)-like su...   312   4e-85
AT1G48280.1 | Symbols:  | hydroxyproline-rich glycoprotein famil...   272   4e-73
AT1G07120.1 | Symbols:  | FUNCTIONS IN: molecular_function unkno...   241   1e-63
AT4G04980.1 | Symbols:  | unknown protein; BEST Arabidopsis thal...    61   2e-09

>AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
           family protein | chr3:9354061-9357757 FORWARD
           LENGTH=1004
          Length = 1004

 Score =  469 bits (1207), Expect = e-132,   Method: Compositional matrix adjust.
 Identities = 224/269 (83%), Positives = 252/269 (93%), Gaps = 1/269 (0%)

Query: 237 DKVHRAPELVEFYQSLMKREAKKDTSTLLISS-TSNASDARSNMIGEIENRSTFLLAVKA 295
           +KVHRAPELVEFYQSLMKRE+KK+ +  LISS T N+S AR+NMIGEIENRSTFLLAVKA
Sbjct: 716 NKVHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKA 775

Query: 296 DVETQGDFVTSLATEVRAATFSNIDDLVAFVNWLDEELSFLVDERAVLKHFDWPEGKADA 355
           DVETQGDFV SLATEVRA++F++I+DL+AFV+WLDEELSFLVDERAVLKHFDWPEGKADA
Sbjct: 776 DVETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADA 835

Query: 356 LREAAFEYQDLMKLEKRVSTFADDPKLSCDAALKKMYSLLEKVEQSVYALLRTRDLAISR 415
           LREAAFEYQDLMKLEK+V++F DDP LSC+ ALKKMY LLEKVEQSVYALLRTRD+AISR
Sbjct: 836 LREAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISR 895

Query: 416 YKEFGIPVNWLLDSGVVGKIKLSSVQLAKKYMKRVTSELDALSGSEKEPAREFLTLQGVR 475
           YKEFGIPV+WL D+GVVGKIKLSSVQLAKKYMKRV  ELD++SGS+K+P REFL LQGVR
Sbjct: 896 YKEFGIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVR 955

Query: 476 FAFRVHQFAGGFDADSMKAFEDLRNRIQT 504
           FAFRVHQFAGGFDA+SMKAFE+LR+R +T
Sbjct: 956 FAFRVHQFAGGFDAESMKAFEELRSRAKT 984



 Score =  149 bits (375), Expect = 6e-36,   Method: Compositional matrix adjust.
 Identities = 84/177 (47%), Positives = 107/177 (60%), Gaps = 28/177 (15%)

Query: 6   KPRGPLESLMIRNAGDGVAITTFGQRDQETTDSPEIPTTPNLRRAPXXXX----XXXXXX 61
           K RGPLESLMIRNAG+ VAITTFGQ DQE+  +PE P  P +R                 
Sbjct: 482 KQRGPLESLMIRNAGESVAITTFGQVDQESPGTPETPNLPRIRTQQQASSPGEGLNSVAA 541

Query: 62  XFHLMSKSIDGSLDEKYPAYKDRHKLALSREKQLKEKAEKARVQKFGDSSNLSISKAERD 121
            FH+MSKS+D  LDEKYPAYKDRHKLA+ REK +K KA++AR ++FG +           
Sbjct: 542 SFHVMSKSVDNVLDEKYPAYKDRHKLAVEREKHIKHKADQARAERFGGN----------- 590

Query: 122 RPISLPPKLTLIKEK----PLVSASANDQ------SDDGK-NVDTQNISKMKLADIE 167
             ++LPPKL  +KEK    P V  +  DQ      S++GK + +   ++KMKL DIE
Sbjct: 591 --VALPPKLAQLKEKRVVVPSVITATGDQSNESNESNEGKASENAATVTKMKLVDIE 645


>AT3G25690.1 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
           family protein | chr3:9354061-9357757 FORWARD
           LENGTH=1004
          Length = 1004

 Score =  469 bits (1207), Expect = e-132,   Method: Compositional matrix adjust.
 Identities = 224/269 (83%), Positives = 252/269 (93%), Gaps = 1/269 (0%)

Query: 237 DKVHRAPELVEFYQSLMKREAKKDTSTLLISS-TSNASDARSNMIGEIENRSTFLLAVKA 295
           +KVHRAPELVEFYQSLMKRE+KK+ +  LISS T N+S AR+NMIGEIENRSTFLLAVKA
Sbjct: 716 NKVHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKA 775

Query: 296 DVETQGDFVTSLATEVRAATFSNIDDLVAFVNWLDEELSFLVDERAVLKHFDWPEGKADA 355
           DVETQGDFV SLATEVRA++F++I+DL+AFV+WLDEELSFLVDERAVLKHFDWPEGKADA
Sbjct: 776 DVETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADA 835

Query: 356 LREAAFEYQDLMKLEKRVSTFADDPKLSCDAALKKMYSLLEKVEQSVYALLRTRDLAISR 415
           LREAAFEYQDLMKLEK+V++F DDP LSC+ ALKKMY LLEKVEQSVYALLRTRD+AISR
Sbjct: 836 LREAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISR 895

Query: 416 YKEFGIPVNWLLDSGVVGKIKLSSVQLAKKYMKRVTSELDALSGSEKEPAREFLTLQGVR 475
           YKEFGIPV+WL D+GVVGKIKLSSVQLAKKYMKRV  ELD++SGS+K+P REFL LQGVR
Sbjct: 896 YKEFGIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVR 955

Query: 476 FAFRVHQFAGGFDADSMKAFEDLRNRIQT 504
           FAFRVHQFAGGFDA+SMKAFE+LR+R +T
Sbjct: 956 FAFRVHQFAGGFDAESMKAFEELRSRAKT 984



 Score =  149 bits (375), Expect = 6e-36,   Method: Compositional matrix adjust.
 Identities = 84/177 (47%), Positives = 107/177 (60%), Gaps = 28/177 (15%)

Query: 6   KPRGPLESLMIRNAGDGVAITTFGQRDQETTDSPEIPTTPNLRRAPXXXX----XXXXXX 61
           K RGPLESLMIRNAG+ VAITTFGQ DQE+  +PE P  P +R                 
Sbjct: 482 KQRGPLESLMIRNAGESVAITTFGQVDQESPGTPETPNLPRIRTQQQASSPGEGLNSVAA 541

Query: 62  XFHLMSKSIDGSLDEKYPAYKDRHKLALSREKQLKEKAEKARVQKFGDSSNLSISKAERD 121
            FH+MSKS+D  LDEKYPAYKDRHKLA+ REK +K KA++AR ++FG +           
Sbjct: 542 SFHVMSKSVDNVLDEKYPAYKDRHKLAVEREKHIKHKADQARAERFGGN----------- 590

Query: 122 RPISLPPKLTLIKEK----PLVSASANDQ------SDDGK-NVDTQNISKMKLADIE 167
             ++LPPKL  +KEK    P V  +  DQ      S++GK + +   ++KMKL DIE
Sbjct: 591 --VALPPKLAQLKEKRVVVPSVITATGDQSNESNESNEGKASENAATVTKMKLVDIE 645


>AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
           family protein | chr3:9354061-9357757 FORWARD LENGTH=863
          Length = 863

 Score =  469 bits (1207), Expect = e-132,   Method: Compositional matrix adjust.
 Identities = 224/269 (83%), Positives = 252/269 (93%), Gaps = 1/269 (0%)

Query: 237 DKVHRAPELVEFYQSLMKREAKKDTSTLLISS-TSNASDARSNMIGEIENRSTFLLAVKA 295
           +KVHRAPELVEFYQSLMKRE+KK+ +  LISS T N+S AR+NMIGEIENRSTFLLAVKA
Sbjct: 575 NKVHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKA 634

Query: 296 DVETQGDFVTSLATEVRAATFSNIDDLVAFVNWLDEELSFLVDERAVLKHFDWPEGKADA 355
           DVETQGDFV SLATEVRA++F++I+DL+AFV+WLDEELSFLVDERAVLKHFDWPEGKADA
Sbjct: 635 DVETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHFDWPEGKADA 694

Query: 356 LREAAFEYQDLMKLEKRVSTFADDPKLSCDAALKKMYSLLEKVEQSVYALLRTRDLAISR 415
           LREAAFEYQDLMKLEK+V++F DDP LSC+ ALKKMY LLEKVEQSVYALLRTRD+AISR
Sbjct: 695 LREAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKLLEKVEQSVYALLRTRDMAISR 754

Query: 416 YKEFGIPVNWLLDSGVVGKIKLSSVQLAKKYMKRVTSELDALSGSEKEPAREFLTLQGVR 475
           YKEFGIPV+WL D+GVVGKIKLSSVQLAKKYMKRV  ELD++SGS+K+P REFL LQGVR
Sbjct: 755 YKEFGIPVDWLSDTGVVGKIKLSSVQLAKKYMKRVAYELDSVSGSDKDPNREFLLLQGVR 814

Query: 476 FAFRVHQFAGGFDADSMKAFEDLRNRIQT 504
           FAFRVHQFAGGFDA+SMKAFE+LR+R +T
Sbjct: 815 FAFRVHQFAGGFDAESMKAFEELRSRAKT 843



 Score =  149 bits (375), Expect = 6e-36,   Method: Compositional matrix adjust.
 Identities = 84/177 (47%), Positives = 107/177 (60%), Gaps = 28/177 (15%)

Query: 6   KPRGPLESLMIRNAGDGVAITTFGQRDQETTDSPEIPTTPNLRRAPXXXX----XXXXXX 61
           K RGPLESLMIRNAG+ VAITTFGQ DQE+  +PE P  P +R                 
Sbjct: 341 KQRGPLESLMIRNAGESVAITTFGQVDQESPGTPETPNLPRIRTQQQASSPGEGLNSVAA 400

Query: 62  XFHLMSKSIDGSLDEKYPAYKDRHKLALSREKQLKEKAEKARVQKFGDSSNLSISKAERD 121
            FH+MSKS+D  LDEKYPAYKDRHKLA+ REK +K KA++AR ++FG +           
Sbjct: 401 SFHVMSKSVDNVLDEKYPAYKDRHKLAVEREKHIKHKADQARAERFGGN----------- 449

Query: 122 RPISLPPKLTLIKEK----PLVSASANDQ------SDDGK-NVDTQNISKMKLADIE 167
             ++LPPKL  +KEK    P V  +  DQ      S++GK + +   ++KMKL DIE
Sbjct: 450 --VALPPKLAQLKEKRVVVPSVITATGDQSNESNESNEGKASENAATVTKMKLVDIE 504


>AT4G18570.1 | Symbols:  | Tetratricopeptide repeat (TPR)-like
           superfamily protein | chr4:10231439-10234534 FORWARD
           LENGTH=642
          Length = 642

 Score =  312 bits (799), Expect = 4e-85,   Method: Compositional matrix adjust.
 Identities = 159/280 (56%), Positives = 208/280 (74%), Gaps = 15/280 (5%)

Query: 234 MAGDKVHRAPELVEFYQSLMKREA---KKDTS------TLLISSTSNASDARSNMIGEIE 284
           +A  KV R PE+VEFY SLM+R++   ++D++         I + SNA D    MIGEIE
Sbjct: 348 IASAKVRRVPEVVEFYHSLMRRDSTNSRRDSTGGGNAAAEAILANSNARD----MIGEIE 403

Query: 285 NRSTFLLAVKADVETQGDFVTSLATEVRAATFSNIDDLVAFVNWLDEELSFLVDERAVLK 344
           NRS +LLA+K DVETQGDF+  L  EV  A FS+I+D+V FV WLD+ELS+LVDERAVLK
Sbjct: 404 NRSVYLLAIKTDVETQGDFIRFLIKEVGNAAFSDIEDVVPFVKWLDDELSYLVDERAVLK 463

Query: 345 HFDWPEGKADALREAAFEYQDLMKLEKRVSTFADDPKLSCDAALKKMYSLLEKVEQSVYA 404
           HF+WPE KADALREAAF Y DL KL    S F +DP+ S  +ALKKM +L EK+E  VY+
Sbjct: 464 HFEWPEQKADALREAAFCYFDLKKLISEASRFREDPRQSSSSALKKMQALFEKLEHGVYS 523

Query: 405 LLRTRDLAISRYKEFGIPVNWLLDSGVVGKIKLSSVQLAKKYMKRVTSELDALSGSEKEP 464
           L R R+ A +++K F IPV+W+L++G+  +IKL+SV+LA KYMKRV++EL+A+ G   E 
Sbjct: 524 LSRMRESAATKFKSFQIPVDWMLETGITSQIKLASVKLAMKYMKRVSAELEAIEGGGPEE 583

Query: 465 AREFLTLQGVRFAFRVHQFAGGFDADSMKAFEDLRNRIQT 504
               L +QGVRFAFRVHQFAGGFDA++MKAFE+LR++ ++
Sbjct: 584 EE--LIVQGVRFAFRVHQFAGGFDAETMKAFEELRDKARS 621


>AT1G48280.1 | Symbols:  | hydroxyproline-rich glycoprotein family
           protein | chr1:17835196-17837553 FORWARD LENGTH=558
          Length = 558

 Score =  272 bits (696), Expect = 4e-73,   Method: Compositional matrix adjust.
 Identities = 122/265 (46%), Positives = 190/265 (71%)

Query: 238 KVHRAPELVEFYQSLMKREAKKDTSTLLISSTSNASDARSNMIGEIENRSTFLLAVKADV 297
           +  ++P + + +Q L K++  ++ S  +  + S  + A ++++GEI+NRS  L+A+KAD+
Sbjct: 278 RAQKSPPVSQLFQLLNKQDNSRNLSQSVNGNKSQVNSAHNSIVGEIQNRSAHLIAIKADI 337

Query: 298 ETQGDFVTSLATEVRAATFSNIDDLVAFVNWLDEELSFLVDERAVLKHFDWPEGKADALR 357
           ET+G+F+  L  +V    FS+++D++ FV+WLD+EL+ L DERAVLKHF WPE KAD L+
Sbjct: 338 ETKGEFINDLIQKVLTTCFSDMEDVMKFVDWLDKELATLADERAVLKHFKWPEKKADTLQ 397

Query: 358 EAAFEYQDLMKLEKRVSTFADDPKLSCDAALKKMYSLLEKVEQSVYALLRTRDLAISRYK 417
           EAA EY++L KLEK +S+++DDP +    ALKKM +LL+K EQ +  L+R R  ++  Y+
Sbjct: 398 EAAVEYRELKKLEKELSSYSDDPNIHYGVALKKMANLLDKSEQRIRRLVRLRGSSMRSYQ 457

Query: 418 EFGIPVNWLLDSGVVGKIKLSSVQLAKKYMKRVTSELDALSGSEKEPAREFLTLQGVRFA 477
           +F IPV W+LDSG++ KIK +S++LAK YM RV +EL +    ++E  +E L LQGVRFA
Sbjct: 458 DFKIPVEWMLDSGMICKIKRASIKLAKTYMNRVANELQSARNLDRESTKEALLLQGVRFA 517

Query: 478 FRVHQFAGGFDADSMKAFEDLRNRI 502
           +R HQFAGG D +++ A E+++ R+
Sbjct: 518 YRTHQFAGGLDPETLCALEEIKQRV 542


>AT1G07120.1 | Symbols:  | FUNCTIONS IN: molecular_function unknown;
           INVOLVED IN: biological_process unknown; LOCATED IN:
           chloroplast envelope; EXPRESSED IN: inflorescence
           meristem, petal, leaf whorl, flower; EXPRESSED DURING: 4
           anthesis, petal differentiation and expansion stage;
           BEST Arabidopsis thaliana protein match is:
           Tetratricopeptide repeat (TPR)-like superfamily protein
           (TAIR:AT4G18570.1); Has 288 Blast hits to 260 proteins
           in 50 species: Archae - 0; Bacteria - 8; Metazoa - 27;
           Fungi - 15; Plants - 163; Viruses - 0; Other Eukaryotes
           - 75 (source: NCBI BLink). | chr1:2184874-2186580
           REVERSE LENGTH=392
          Length = 392

 Score =  241 bits (615), Expect = 1e-63,   Method: Compositional matrix adjust.
 Identities = 126/263 (47%), Positives = 178/263 (67%), Gaps = 9/263 (3%)

Query: 239 VHRAPELVEFYQSLMKREAKKDTSTLLISSTSNASDA-RSNMIGEIENRSTFLLAVKADV 297
           V RAPE+VEFY++L KRE+        I+     S A   NMIGEIENRS +L  +K+D 
Sbjct: 128 VRRAPEVVEFYRALTKRESHMGNK---INQNGVLSPAFNRNMIGEIENRSKYLSDIKSDT 184

Query: 298 ETQGDFVTSLATEVRAATFSNIDDLVAFVNWLDEELSFLVDERAVLKHF-DWPEGKADAL 356
           +   D +  L ++V AATF++I ++  FV W+DEELS LVDERAVLKHF  WPE K D+L
Sbjct: 185 DRHRDHIHILISKVEAATFTDISEVETFVKWIDEELSSLVDERAVLKHFPKWPERKVDSL 244

Query: 357 REAAFEYQDLMKLEKRVSTFADDPKLSCDAALKKMYSLLEKVEQSVYALLRTRDLAISRY 416
           REAA  Y+    L   + +F D+PK S   AL+++ SL +++E+SV    + RD    RY
Sbjct: 245 REAACNYKRPKNLGNEILSFKDNPKDSLTQALQRIQSLQDRLEESVNNTEKMRDSTGKRY 304

Query: 417 KEFGIPVNWLLDSGVVGKIKLSSVQLAKKYMKRVTSELDALSGSEKEPAREFLTLQGVRF 476
           K+F IP  W+LD+G++G++K SS++LA++YMKR+  EL++ +GS KE     L LQGVRF
Sbjct: 305 KDFQIPWEWMLDTGLIGQLKYSSLRLAQEYMKRIAKELES-NGSGKEGN---LMLQGVRF 360

Query: 477 AFRVHQFAGGFDADSMKAFEDLR 499
           A+ +HQFAGGFD +++  F +L+
Sbjct: 361 AYTIHQFAGGFDGETLSIFHELK 383


>AT4G04980.1 | Symbols:  | unknown protein; BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT1G11070.1); Has 23100 Blast hits to 15699
           proteins in 1063 species: Archae - 116; Bacteria - 2262;
           Metazoa - 8308; Fungi - 3268; Plants - 3181; Viruses -
           958; Other Eukaryotes - 5007 (source: NCBI BLink). |
           chr4:2544210-2547893 REVERSE LENGTH=880
          Length = 880

 Score = 60.8 bits (146), Expect = 2e-09,   Method: Compositional matrix adjust.
 Identities = 59/241 (24%), Positives = 112/241 (46%), Gaps = 17/241 (7%)

Query: 275 ARSNM---IGEIENRSTFLLAVKADVETQGDFVTSLATEVRAATFSNIDDLVAFVNWLDE 331
           ARS M   + E+  RS++   ++ DV+     +  L + + +    ++ +L+ F + ++ 
Sbjct: 635 ARSGMADALAEMTKRSSYFQQIEEDVQKYAKSIEELKSSIHSFQTKDMKELLEFHSKVES 694

Query: 332 ELSFLVDERAVLKHFD-WPEGKADALREAAFEYQDLMKLEKRVSTFADDPKLSCDAALKK 390
            L  L DE  VL  F+ +PE K + +R A   Y+ L  +   +  +  +P L  +  L K
Sbjct: 695 ILEKLTDETQVLARFEGFPEKKLEVIRTAGALYKKLDGILVELKNWKIEPPL--NDLLDK 752

Query: 391 MYSLLEKVEQSVYALLRTRDLAISRYKEFGIPVNWLLDSGVVGKIKLSSVQLAKKYMK-- 448
           +     K +  +  + RT+D     +K + I +    D  V+ ++K + V ++   M+  
Sbjct: 753 IERYFNKFKGEIETVERTKDEDAKMFKRYNINI----DFEVLVQVKETMVDVSSNCMELA 808

Query: 449 ---RVTSELDALSGSEKEPAREFLT--LQGVRFAFRVHQFAGGFDADSMKAFEDLRNRIQ 503
              R  +  +A +G E +   E      +  +FAF+V+ FAGG D  +      L + IQ
Sbjct: 809 LKERREANEEAKNGEESKMKEERAKRLWRAFQFAFKVYTFAGGHDERADCLTRQLAHEIQ 868

Query: 504 T 504
           T
Sbjct: 869 T 869