The esophagus-specific proteome

The main function of the esophagus is to transport swallowed food and liquids to the stomach. This approximately 25 cm long tube consists of outer layers of striated and smooth muscle, for mechanical propulsion of food, and an inner mucosa lined by non-cornified stratified squamous epithelia. Transcriptome analysis shows that 70% (n=14129) of all human proteins (n=20090) are expressed in the esophagus and 429 of these genes show an elevated expression in the esophagus compared to other tissue types.

  • 429 elevated genes
  • 21 enriched genes
  • 73 group enriched genes
  • Esophagus has most group enriched gene expression in common with vagina

The esophagus transcriptome

Transcriptome analysis of the esophagus can be visualized with regard to the specificity and distribution of transcribed mRNA molecules (Figure 1). Specificity illustrates the number of genes with elevated or non-elevated expression in the esophagus compared to other tissues. Elevated expression is divided into three subcategories:

  • Tissue enriched: At least four-fold higher mRNA level in esophagus compared to any other tissue.
  • Group enriched: At least four-fold higher average mRNA level in a group of 2-5 tissues compared to any other tissue.
  • Tissue enhanced: At least four-fold higher mRNA level in esophagus compared to the average level in all other tissues.
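The three definitions above can be sketched as a small classifier. This is a minimal illustration, not the Human Protein Atlas implementation: the function name, the dictionary input layout, the "not elevated" label, the order in which the categories are tested, and the choice to test only the top-ranked 2-5 tissues as candidate groups are all assumptions made for the example. The four-fold and nTPM≥1 cutoffs are the ones stated in the text.

```python
from statistics import mean

FOLD = 4.0        # four-fold cutoff stated in the text
DETECTION = 1.0   # nTPM >= 1 detection cutoff stated in the text


def classify_specificity(ntpm, tissue="esophagus"):
    """Return a specificity category for one gene (hypothetical helper).

    `ntpm` maps tissue name -> nTPM value and must hold at least two tissues.
    """
    level = ntpm[tissue]
    if level < DETECTION:
        # Assumption: a gene not detected in the tissue is never elevated.
        return "not elevated"

    others = [value for name, value in ntpm.items() if name != tissue]

    # Tissue enriched: at least 4x the level of every other tissue.
    if level >= FOLD * max(others):
        return "tissue enriched"

    # Group enriched: the k highest tissues (k = 2..5) include `tissue`
    # and their average is at least 4x the highest level outside the group.
    ranked = sorted(ntpm, key=ntpm.get, reverse=True)
    for k in range(2, 6):
        if k >= len(ranked):
            break
        group, rest = ranked[:k], ranked[k:]
        if tissue in group:
            group_mean = mean(ntpm[name] for name in group)
            if group_mean >= FOLD * ntpm[rest[0]]:
                return "group enriched"

    # Tissue enhanced: at least 4x the average of all other tissues.
    if level >= FOLD * mean(others):
        return "tissue enhanced"

    return "not elevated"
```

Testing the categories from strictest to loosest means a gene that satisfies several definitions is reported under the strictest one.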

Distribution, on the other hand, visualizes how many genes have, or do not have, detectable levels (nTPM≥1) of transcribed mRNA molecules in the esophagus compared to other tissues. As evident in Table 1, all genes elevated in esophagus are categorized as:

  • Detected in single: Detected in a single tissue
  • Detected in some: Detected in more than one but less than one-third of tissues
  • Detected in many: Detected in at least a third but not all tissues
  • Detected in all: Detected in all tissues
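The distribution categories depend only on how many tissues reach the detection cutoff, which the following sketch makes explicit. The function name, the input layout and the extra "not detected" label for genes below the cutoff everywhere are assumptions for illustration; the four named categories and the nTPM≥1 cutoff come from the text.

```python
def classify_distribution(ntpm, cutoff=1.0):
    """Return a distribution category for one gene (hypothetical helper).

    `ntpm` maps tissue name -> nTPM value across all profiled tissues.
    """
    n_total = len(ntpm)
    n_detected = sum(1 for value in ntpm.values() if value >= cutoff)

    if n_detected == 0:
        # Assumption: label for genes below the cutoff in every tissue.
        return "not detected"
    if n_detected == 1:
        return "detected in single"
    if n_detected == n_total:
        return "detected in all"
    # "Less than one-third of tissues", compared in integers to avoid rounding.
    if 3 * n_detected < n_total:
        return "detected in some"
    return "detected in many"
```

With 36 tissues, a gene detected in 11 tissues falls under "detected in some", while one detected in 12 tissues is already "detected in many".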

Figure 1. (A) Specificity: the distribution of all genes across the five categories based on transcript specificity in esophagus as well as in all other tissues. (B) Distribution: the distribution of all genes across the six categories, based on transcript detection (nTPM≥1) in esophagus as well as in all other tissues.

As shown in Figure 1, 429 genes show some level of elevated expression in the esophagus compared to other tissues. The three categories of genes with elevated expression in esophagus compared to other tissues are shown in Table 1. Table 2 lists the 12 genes with the highest enrichment in esophagus.

Table 1. The number of genes in the subdivided categories of elevated expression in esophagus.

Distribution in the 36 tissues

Specificity category | Detected in single | Detected in some | Detected in many | Detected in all | Total
Tissue enriched | 0 | 11 | 9 | 1 | 21
Group enriched | 0 | 45 | 28 | 0 | 73
Tissue enhanced | 0 | 55 | 222 | 58 | 335
Total | 0 | 111 | 259 | 59 | 429
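The totals in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below transcribes the per-category counts from the table (columns in the order single, some, many, all); the variable and function names are illustrative only.

```python
# Counts per specificity category, columns: single, some, many, all.
TABLE_1 = {
    "tissue enriched": [0, 11, 9, 1],
    "group enriched": [0, 45, 28, 0],
    "tissue enhanced": [0, 55, 222, 58],
}


def row_totals(table):
    """Sum each specificity category across the distribution columns."""
    return {name: sum(counts) for name, counts in table.items()}


def column_totals(table):
    """Sum each distribution column across the specificity categories."""
    return [sum(column) for column in zip(*table.values())]


print(row_totals(TABLE_1))
print(column_totals(TABLE_1))
print(sum(column_totals(TABLE_1)))
```

The row sums correspond to the 21, 73 and 335 genes in the three categories, and the grand total to the 429 elevated genes.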

Table 2. The 12 genes with the highest level of enriched expression in esophagus. "Tissue distribution" describes the transcript detection (nTPM≥1) in esophagus as well as in all other tissues. "mRNA (tissue)" shows the transcript level in esophagus as nTPM values. "Tissue specificity score (TS)" corresponds to the fold-change between the expression level in esophagus and the tissue with the second-highest expression level.

Gene | Description | Tissue distribution | mRNA (tissue) | Tissue specificity score
CAPN14 | calpain 14 | Detected in some | 144.1 | 17
UGT1A7 | UDP glucuronosyltransferase family 1 member A7 | Detected in some | 83.8 | 12
DYNAP | dynactin associated protein | Detected in some | 29.3 | 11
KRT4 | keratin 4 | Detected in many | 19735.6 | 9
CSTB | cystatin B | Detected in all | 11304.7 | 8
TGM3 | transglutaminase 3 | Detected in many | 1838.1 | 8
LYPD2 | LY6/PLAUR domain containing 2 | Detected in some | 347.8 | 7
MUC22 | mucin 22 | Detected in some | 17.8 | 7
PADI1 | peptidyl arginine deiminase 1 | Detected in some | 109.5 | 6
SPRR1A | small proline rich protein 1A | Detected in many | 8971.2 | 5
TGM1 | transglutaminase 1 | Detected in many | 713.5 | 5
GBP6 | guanylate binding protein family member 6 | Detected in some | 327.3 | 5
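The tissue specificity score described in the caption of Table 2 is a simple ratio, sketched here under stated assumptions. The function name and the handling of a zero level in all other tissues are hypothetical, and the nTPM values used for tissues other than esophagus in the usage example are invented for illustration rather than taken from the atlas.

```python
def tissue_specificity_score(ntpm, tissue="esophagus"):
    """Fold-change between `tissue` and the highest level among other tissues.

    `ntpm` maps tissue name -> nTPM value (hypothetical input layout).
    """
    level = ntpm[tissue]
    runner_up = max(value for name, value in ntpm.items() if name != tissue)
    if runner_up <= 0:
        # Assumption: report an infinite score when no other tissue expresses the gene.
        return float("inf")
    return level / runner_up


# Usage with made-up values for the non-esophagus tissues.
example = {"esophagus": 144.1, "tissue_x": 8.5, "tissue_y": 1.0}
print(round(tissue_specificity_score(example)))
```

Read this way, the CAPN14 entry (144.1 nTPM, score 17) would correspond to a second-highest tissue at roughly 8.5 nTPM, assuming the tabulated score is a rounded ratio.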

Protein expression of genes elevated in esophagus

In-depth analysis of the elevated genes in the esophagus using antibody-based proteomics allowed us to create an overview of the localization of the corresponding proteins. A large number of these proteins have functions related to squamous differentiation and are thus often also shared with other tissue types that are composed of squamous epithelia.

Proteins specifically expressed in esophageal epithelia

The inner lining of the esophagus consists of glycoprotein-rich mucosal squamous epithelium that, unlike the skin, lacks an outer layer of cornified cells. Like most squamous epithelia, the esophagus expresses a variety of keratin intermediate filament proteins whose function is to provide structural integrity between the cells. Among structural proteins, keratin 4 (KRT4), keratin 6 (KRT6A, KRT6B and KRT6C), keratin 13 (KRT13) and keratin 32 (KRT32) showed high enrichment, together with the calcium-binding proteins cornulin (CRNN) and S100A14. KRT13 is primarily expressed in the mucosal epithelia, as is MUC21, which is observed in esophageal epithelial cells. An interesting enriched protein is the alcohol-degrading enzyme ADH7, which is observed in mucinous epithelial cells of the esophagus and stomach.




Proteins specifically expressed in esophageal muscle

Some genes that are enriched in the esophagus show no protein expression in the epithelial cells but are instead expressed in the muscular layers surrounding the epithelium. The metal-binding protein GNA15 and the transcription factor NKX6-1 are examples of these. Both GNA15 and NKX6-1 appear to be specifically expressed in the muscles of the esophagus and have not previously been described in this tissue.



Gene expression shared between esophagus and other tissues

There are 73 group enriched genes expressed in esophagus. Group enriched genes are defined as genes showing at least four-fold higher average mRNA levels in a group of 2-5 tissues, including esophagus, compared to all other tissues.

To illustrate the relation of esophagus tissue to other tissue types, a network plot was generated, displaying the number of genes with a shared expression between different tissue types.

Figure 2. An interactive network plot of the esophagus enriched and group enriched genes connected to their respective enriched tissues (grey circles). Red nodes represent the number of esophagus enriched genes and orange nodes represent the number of genes that are group enriched. The sizes of the red and orange nodes are related to the number of genes displayed within the node. Each node is clickable and results in a list of all enriched genes connected to the highlighted edges. The network is limited to group enriched genes in combinations of up to 3 tissues, but the resulting lists show the complete set of group enriched genes in the particular tissue.

The esophagus shares a large number of group enriched genes (n=18) with the skin and vagina, tissues whose squamous epithelial structure is highly similar to that of the esophagus. Many of these group enriched genes belong to gene families known to be important for normal squamous epithelial function, for example keratins such as KRT5 and KRT15. Genes associated with desmosomes, such as DSP and EVPL, are also found, as are some genes without a known function, such as TMEM40.

Image panels: KRT5, DSP and TMEM40, each shown in esophagus, skin and vagina.

Some lymphoid tissues, like tonsil, have squamous epithelium components (in addition to their lymphocyte-containing centers), and many genes related to squamous epithelium are shared between the esophagus and lymphoid tissues (n=21). Examples of proteins that are group enriched in the esophagus and lymphoid tissue include the calcium-dependent glycoprotein desmocollin 3 (DSC3) and BARX2, a transcription factor related to genes controlling cell adhesion.

Image panels: DSC3 and BARX2, each shown in esophagus and tonsil.

Esophagus function

The esophagus is the gastrointestinal canal that connects the mouth with the stomach. In contrast to the rest of the digestive system, the esophagus does not have any absorptive or digestive functions. Anatomically, it is continuous with the back of the oral cavity and pharynx and runs downward through the diaphragm for approximately 20-30 cm until it reaches the stomach.

During swallowing, food is pressed from the mouth and pharynx into the esophagus. The swallowing reflex then opens the upper esophageal sphincter muscle to allow food to enter the esophagus, and the epiglottis folds down to prevent food from entering the trachea and respiratory organs. The muscles lining the length of the esophagus then contract rhythmically to help push the food towards the lower esophageal sphincter muscle, which opens to allow food to enter the stomach. Both the upper and lower sphincter muscles are constricted by default, except during swallowing or vomiting. The lower sphincter muscle also protects the esophagus from the acidic contents and digestive enzymes of the stomach.

Esophagus histology

The esophagus has the same general gross anatomical and histological organization as the rest of the gastrointestinal tract, with an outer muscular layer, a submucosa, a muscularis mucosae, a lamina propria and an innermost epithelium, which in the case of the esophagus is a stratified squamous epithelium. However, since the esophagus is located outside of the abdominal cavity, it has no mesothelial covering. Instead, its outermost layer is a connective tissue called the adventitia.

The innermost part is the esophageal epithelium, which has a rapid turnover of cells due to the continuous wear and tear from food ingestion. Like most epithelial tissues, cell renewal takes place in the basal part of the epithelium, and as new cells are generated, older cells lose contact with the basement membrane and are pushed towards the surface. Cells close to the basal layer appear columnar with round nuclei. As these cells mature and detach, they are pushed towards the apical layer of the epithelium and gradually differentiate into flattened and tightly coupled cells.

The squamous epithelium rests on the lamina propria that consists of loose connective tissue and focal lymphocytes. Underneath this layer comes the lamina muscularis mucosae composed of smooth muscle cells, followed by the submucosal layer, which is composed of loose connective tissue and mucus-secreting glands, small blood vessels and lymphocytes. Below the submucosa we find the tunica muscularis that is composed of an inner layer of circular muscles, followed by externally located longitudinal muscle fibers. In the upper third of the esophagus towards the mouth, the external layer is composed of skeletal muscle, the middle third contains a mixture of smooth and skeletal muscle, and the third closest to the stomach contains only smooth muscle.

The histology of human esophagus including detailed images and information can be viewed in the Protein Atlas Histology Dictionary.


Here, the protein-coding genes expressed in esophagus are described and characterized, together with examples of immunohistochemically stained tissue sections that visualize corresponding protein expression patterns of genes with elevated expression in esophagus.

Transcript profiling was based on a combination of two transcriptomics datasets (HPA and GTEx), corresponding to a total of 14590 samples from 54 different human normal tissue types. The final consensus normalized expression (nTPM) value for each tissue type was used for the classification of all genes according to the tissue-specific expression into two different categories, based on specificity or distribution.
