The HUGE protein database has been created to publicize the results of our Human cDNA project at the Kazusa DNA Research Institute. In this project, we plan to sequence and analyze long (>4 kb) human cDNAs and to establish methods that use the sequence data to predict the primary structures of proteins with various biological activities. Currently, we focus on the analysis of cDNA clones encoding particularly large proteins (>50 kDa). The basic concept underlying our project and the strategies employed have been described elsewhere (Ohara et al., 1997). The HUGE protein database contains various types of information derived from the predicted primary structures of newly identified human proteins. It is expected to cover diverse sets of large human proteins whose functions have not yet been identified. These proteins are likely to be involved in cellular structure and motility (such as cytoskeletal, membrane skeleton, and motor proteins), gene expression and nucleic acid metabolism, and cell signaling and communication (such as cell adhesion, signal transduction, channels, and receptors), among other processes.