The specialized epithelial cell-specific proteome

Epithelial cells form sheets of cells, epithelia, that line the outer and inner surfaces of the body and constitute the building blocks for glandular tissues. In addition to glandular and squamous epithelial cells, there are several other types of epithelial cells, specialized to the purpose of their environment.

  • 3811 elevated genes
  • 301 enriched genes
  • 413 group enriched genes
  • Main function: Organ-specific functions

Transcriptome analysis shows that 80% (n=16143) of all human proteins (n=20090) are detected in specialized epithelial cells, and 3811 of these genes show elevated expression in at least one specialized epithelial cell type compared to other cell type groups. In-depth analysis of the elevated genes in specialized epithelial cells using scRNA-seq and antibody-based protein profiling allowed us to visualize the expression patterns of these proteins in specialized epithelial cell types of the following tissues: lung, pancreas, liver, kidney, urinary bladder, testis and ovary.


The specialized epithelial cell transcriptome

The scRNA-seq-based specialized epithelial cell transcriptome can be analyzed with regard to specificity, illustrating the number of genes with elevated expression in each specific specialized epithelial cell type compared to other cell types (Table 1). Genes with an elevated expression are divided into three subcategories:

  • Cell type enriched: At least four-fold higher mRNA level in a certain cell type compared to any other cell type.
  • Group enriched: At least four-fold higher average mRNA level in a group of 2-10 cell types compared to any other cell type.
  • Cell type enhanced: At least four-fold higher mRNA level in a certain cell type compared to the average level in all other cell types.
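
The three definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the classification pipeline itself: the function name `classify_specificity`, its parameters and the expression values in the usage note are hypothetical, the categories are tested in the order listed, and any detection cutoff that the actual analysis may apply is ignored.

```python
from statistics import mean


def classify_specificity(expr, fold=4.0, max_group=10):
    """Classify one gene from its expression level per cell type (e.g. nTPM).

    `expr` maps cell type name -> expression value.
    Returns (category, cell_types). Simplified: no detection cutoff is applied.
    """
    ranked = sorted(expr.items(), key=lambda kv: kv[1], reverse=True)
    names = [name for name, _ in ranked]
    values = [value for _, value in ranked]
    if len(values) < 2 or values[0] <= 0:
        return ("not elevated", [])

    # Cell type enriched: the top cell type is at least `fold` times
    # higher than any other cell type (i.e. than the second highest).
    if values[0] >= fold * values[1]:
        return ("cell type enriched", names[:1])

    # Group enriched: the mean of the k highest cell types (2..max_group)
    # is at least `fold` times the highest value outside that group.
    for k in range(2, min(max_group, len(values) - 1) + 1):
        if mean(values[:k]) >= fold * values[k]:
            return ("group enriched", names[:k])

    # Cell type enhanced: a cell type is at least `fold` times higher
    # than the average of all other cell types.
    enhanced = [
        name
        for i, name in enumerate(names)
        if values[i] >= fold * mean(values[:i] + values[i + 1:])
    ]
    if enhanced:
        return ("cell type enhanced", enhanced)
    return ("not elevated", [])
```

For example, with made-up values, `classify_specificity({"hepatocytes": 400, "cholangiocytes": 20, "T-cells": 5})` falls in the cell type enriched category for hepatocytes, because 400 is more than four times the next highest value.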

Table 1. Number of genes in the subdivided specificity categories of elevated expression in the analyzed specialized epithelial cell types.

Cell type                         Tissue origin  Cell type enriched  Group enriched  Cell type enhanced  Total elevated
Alveolar cells type 1             Lung           12                  63              394                 469
Alveolar cells type 2             Lung           16                  69              383                 468
Ductal cells                      Pancreas       2                   16              220                 238
Hepatocytes                       Liver          156                 114             581                 851
Cholangiocytes                    Liver          3                   24              155                 182
Proximal tubular cells            Kidney         45                  100             762                 907
Distal tubular cells              Kidney         23                  40              553                 616
Collecting duct cells             Kidney         12                  24              280                 316
Urothelial cells                  Prostate       3                   24              366                 393
Sertoli cells                     Testis         6                   43              285                 334
Granulosa cells                   Ovary          23                  45              428                 496
Any specialized epithelial cells  -              301                 413             3097                3811
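
As a reading aid for Table 1, the short Python sketch below checks how the columns relate to each other. The counts are copied from the table; the names `TABLE_1`, `ANY_ROW`, `rows_with_wrong_total` and `column_sums` are illustrative only. It confirms that "Total elevated" is the sum of the three subcategories in every row. It also shows that the cell type enriched column adds up to the value in the "Any specialized epithelial cells" row, whereas the group enriched column adds up to more than that row's value, presumably because a gene elevated in several cell types is counted once per cell type.

```python
# (cell type, tissue origin, cell type enriched, group enriched,
#  cell type enhanced, total elevated), copied from Table 1.
TABLE_1 = [
    ("Alveolar cells type 1", "Lung", 12, 63, 394, 469),
    ("Alveolar cells type 2", "Lung", 16, 69, 383, 468),
    ("Ductal cells", "Pancreas", 2, 16, 220, 238),
    ("Hepatocytes", "Liver", 156, 114, 581, 851),
    ("Cholangiocytes", "Liver", 3, 24, 155, 182),
    ("Proximal tubular cells", "Kidney", 45, 100, 762, 907),
    ("Distal tubular cells", "Kidney", 23, 40, 553, 616),
    ("Collecting duct cells", "Kidney", 12, 24, 280, 316),
    ("Urothelial cells", "Prostate", 3, 24, 366, 393),
    ("Sertoli cells", "Testis", 6, 43, 285, 334),
    ("Granulosa cells", "Ovary", 23, 45, 428, 496),
]
# Summary row: enriched, group enriched, enhanced, total.
ANY_ROW = (301, 413, 3097, 3811)


def rows_with_wrong_total(rows):
    """Return the cell types whose three subcategories do not sum to the total."""
    return [r[0] for r in rows if r[2] + r[3] + r[4] != r[5]]


def column_sums(rows):
    """Sum the enriched, group enriched and enhanced columns over all rows."""
    return tuple(sum(r[i] for r in rows) for i in (2, 3, 4))


if __name__ == "__main__":
    print("Rows with inconsistent totals:", rows_with_wrong_total(TABLE_1))
    print("Column sums:", column_sums(TABLE_1))
    print("Summary row:", ANY_ROW[:3])
```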


Lung


Alveolar cells type 1

As shown in Table 1, 469 genes are elevated in alveolar cells type 1 compared to other cell types. Gas exchange between the air in the lung alveoli and blood takes place via the alveolar cells type 1, which line the alveolar walls. Examples of proteins elevated in alveolar cells type 1 are aquaporin 4 (AQP4), a cell membrane-bound channel that regulates water homeostasis, and claudin 18 (CLDN18), a tight junction protein which prevents the passage of solutes and other molecules through the paracellular space in the epithelium.



[Figure: AQP4 - lung. Expression axis: nTPM.]

[Figure: CLDN18 - lung. Expression axis: nTPM.]


Alveolar cells type 2

As shown in Table 1, 468 genes are elevated in alveolar cells type 2 compared to other cell types. Alveolar cells type 2 are located in lung alveoli and produce surfactants, which are crucial for gas exchange between air and blood and which lower surface tension, thereby preventing alveolar collapse. A gene with enriched expression in alveolar cells type 2 is surfactant protein C (SFTPC), which encodes a surfactant protein. Another example is napsin A aspartic peptidase (NAPSA), a protease that may play a role in the proteolytic processing of surfactant protein B.



[Figure: SFTPC - lung. Expression axis: nTPM.]

[Figure: NAPSA - lung. Expression axis: nTPM.]


Pancreas


Ductal cells

As shown in Table 1, 238 genes are elevated in ductal epithelial cells compared to other cell types. Ductal epithelial cells in the pancreas are found throughout the exocrine tissue, where they transport the secretions from the acini to the duodenum. Cystic fibrosis transmembrane conductance regulator (CFTR) is an example of a gene elevated in the ductal epithelium of the pancreas. CFTR functions as an ion channel that transports Cl- and HCO3- ions out of the cell in an ATP-dependent manner.



[Figure: CFTR - pancreas. Expression axis: nTPM.]


Liver


Hepatocytes

As shown in Table 1, 851 genes are elevated in hepatocytes compared to other cell types. The hepatocytes are the main cell type in the liver and responsible for many of the body's metabolic processes as well as the breakdown of toxic substances. Examples of hepatocyte enhanced genes are retinol dehydrogenase 16 (RDH16) and hydroxyacid oxidase 1 (HAO1) involved in lipid metabolism.



[Figure: RDH16 - liver. Expression axis: nTPM.]

[Figure: HAO1 - liver. Expression axis: nTPM.]


Cholangiocytes

As shown in Table 1, 182 genes are elevated in cholangiocytes compared to other cell types. Cholangiocytes are the epithelial cells of the bile duct system in the liver. Examples of genes with enhanced expression in cholangiocytes are the transcription factor hepatocyte nuclear factor 1-beta (HNF1B) and the transport protein aquaporin 1 (AQP1), which forms a channel through which water moves along the osmotic gradient.



[Figure: HNF1B - liver. Expression axis: nTPM.]

[Figure: AQP1 - liver. Expression axis: nTPM.]


Kidney


Proximal tubular cells

As shown in Table 1, 907 genes are elevated in proximal tubular cells compared to other cell types. Approximately 60% of the filtered Na+, Cl-, K+, Ca2+ and H2O, and more than 90% of the filtered HCO3-, are absorbed along the proximal tubule. This is also the segment that normally reabsorbs virtually all the filtered glucose and amino acids. An additional function is the secretion of numerous organic anions and cations. Examples of proteins that are elevated in the proximal part of the renal tubules are agmatinase (AGMAT), an enzyme involved in the processing of urea and amino acids, and N-acetyltransferase 8 (NAT8), an enzyme that catalyzes the acetylation of cysteine S-conjugates to form mercapturic acids as part of the detoxification of a wide variety of reactive electrophiles.



[Figure: AGMAT - kidney. Expression axis: nTPM.]

[Figure: NAT8 - kidney. Expression axis: nTPM.]


Distal tubular cells

As shown in Table 1, 616 genes are elevated in distal tubular cells compared to other cell types. Both the distal tubule and collecting duct are the sites where critical regulatory hormones such as aldosterone and vasopressin regulate acid and potassium excretion and determine the final urinary concentration of K+, Na+, and Cl-. Proteins elevated in distal tubules include solute carrier family 12 member 1 (SLC12A1), one of several potassium, sodium, and chloride transporters essential for regulating the contents and volume of urine. Another example is transmembrane protein 52B (TMEM52B), whose function is not completely characterized but which is highly elevated in distal tubular cells.



[Figure: SLC12A1 - kidney. Expression axis: nTPM.]

[Figure: TMEM52B - kidney. Expression axis: nTPM.]


Collecting duct cells

As shown in Table 1, 316 genes are elevated in collecting duct cells compared to other cell types. Collecting duct cells are the main site of salt and water transport, as well as acid-base regulation. Aquaporin 2 (AQP2) and aquaporin 3 (AQP3) are members of the aquaporin gene family with elevated expression in collecting ducts. These two genes encode water-specific channel proteins that facilitate the reabsorption of water molecules from the urine. Another example of a gene elevated in collecting ducts is FXYD domain containing ion transport regulator 4 (FXYD4), which encodes a protein that regulates the transport of ions across the cell membrane.



[Figure: AQP2 - kidney. Expression axis: nTPM.]

[Figure: AQP3 - kidney. Expression axis: nTPM.]

[Figure: FXYD4 - kidney. Expression axis: nTPM.]


Urinary bladder


Urothelial cells

As shown in Table 1, 393 genes are elevated in urothelial cells compared to other cell types. The urothelium, also known as transitional epithelium, is one of the slowest cycling epithelia, with a turnover rate of approximately 200 days. It consists of three layers (basal, intermediate and superficial) and is three to seven cell layers thick, depending on bladder distension. The superficial layer is the only layer that consists of fully differentiated cells, called umbrella cells, which line the lumen as a protective barrier. These cells express transmembrane proteins called uroplakins, such as UPK2 and UPK1A, which are essential structural components of the apical surface that enhance the permeability barrier.



[Figure: UPK2 - prostate (expression axis: nTPM) and UPK2 - urinary bladder.]

[Figure: UPK1A - prostate (expression axis: nTPM) and UPK1A - urinary bladder.]


Testis


Sertoli cells

As shown in Table 1, 334 genes are elevated in Sertoli cells compared to other cell types. Sertoli cells constitute the seminiferous epithelium in testis, interspersed between the germ cells. They play an important role in spermatogenesis, where they are often referred to as nurse cells since their function is to nourish the developing sperm cells. Sertoli cells also play a central role in the control of spermatogenesis by transducing hormonal signals, e.g. activation and stimulation by follicle stimulating hormone (FSH). Examples of elevated genes in Sertoli cells are cannabinoid receptor 1 (CNR1), which is a G-protein coupled receptor for endogenous cannabinoids, and epididymal peptidase inhibitor (EPPIN), which has an essential role in male reproduction and fertility by providing antimicrobial protection.



[Figure: CNR1 - testis. Expression axis: nTPM.]

[Figure: EPPIN - testis. Expression axis: nTPM.]


Ovary


Granulosa cells

As shown in Table 1, 496 genes are elevated in granulosa cells compared to other cell types. Granulosa cells are follicle cells surrounding the oocytes in the ovaries and are believed to originate from ovarian surface epithelium. Their main function is to support the growth and maturation of the oocyte and to support eventual pregnancy following ovulation through the production of hormones and growth factors. ELK1 and FOXL2 encode transcription factors that show enriched mRNA expression in granulosa cells as well as clear nuclear expression in granulosa cells of ovarian follicles.



[Figure: ELK1 - ovary. Expression axis: nTPM.]

[Figure: FOXL2 - ovary. Expression axis: nTPM.]


Specialized epithelial cell function

Epithelial cells form sheets of cells, epithelia, that line the outer and inner surfaces of the body and constitute the building blocks for glandular tissues. Hence, epithelial cells are found in many parts of the body, including skin, airways, the digestive tract, glandular tissues and organs, as well as the urinary and reproductive systems. The wide range of functions of epithelial cells can be broadly divided into two main categories: transferring compounds into or out of the body, and forming a protective barrier against invading pathogens and physical, chemical or biological abrasion.

The histology of organs that contain specialized epithelial cells, including interactive images, is described in the Protein Atlas Histology Dictionary.


Background

Here, the protein-coding genes expressed in specialized epithelial cells are described and characterized, together with examples of immunohistochemically stained tissue sections that visualize corresponding protein expression patterns of genes with elevated expression in different specialized epithelial cell types.

The transcript profiling was based on publicly available genome-wide expression data from scRNA-seq experiments covering 25 tissues and peripheral blood mononuclear cells (PBMCs). All datasets (unfiltered read counts of cells) were clustered separately using Louvain clustering, resulting in a total of 444 different cell type clusters. The clusters were then manually annotated based on a survey of known tissue and cell type-specific markers. The scRNA-seq data from each cluster of cells were aggregated into a mean normalized expression value, given as normalized protein-coding transcripts per million (nTPM), for each protein-coding gene. A specificity and distribution classification was then performed to determine the number of genes elevated in these single cell types, and the number of genes detected in one, several or all cell types, respectively.
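
The aggregation step can be illustrated with a small Python sketch. It is a simplified stand-in, not the normalization used for the atlas: read counts of all cells in a cluster are pooled and scaled to one million, and the additional normalization that yields nTPM is left out. The function name `aggregate_clusters`, the cluster labels and the counts in the example are hypothetical.

```python
from collections import defaultdict


def aggregate_clusters(cells, labels):
    """Pool per-cell read counts by cluster and scale to transcripts per million.

    `cells` is a list of dicts (gene -> read count), one per cell.
    `labels` gives the cluster assigned to each cell, in the same order.
    Returns cluster -> gene -> pooled counts scaled to one million.
    Assumption: between-sample normalization (as in nTPM) is not performed.
    """
    pooled = defaultdict(lambda: defaultdict(float))
    for counts, cluster in zip(cells, labels):
        for gene, n in counts.items():
            pooled[cluster][gene] += n

    result = {}
    for cluster, genes in pooled.items():
        total = sum(genes.values())
        result[cluster] = {
            gene: (1e6 * n / total if total else 0.0)
            for gene, n in genes.items()
        }
    return result


if __name__ == "__main__":
    # Three made-up cells assigned to two made-up clusters.
    cells = [
        {"AQP2": 3, "ACTB": 1},
        {"AQP2": 1, "ACTB": 3},
        {"ALB": 9, "ACTB": 1},
    ]
    labels = ["collecting duct", "collecting duct", "hepatocyte"]
    for cluster, profile in aggregate_clusters(cells, labels).items():
        print(cluster, profile)
```

The resulting per-cluster profiles are the kind of input on which a specificity classification across cell types would operate.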

It should be noted that since the analysis was limited to datasets from 25 tissues and PBMC only, not all human cell types are represented. Furthermore, some cell types are present only in low amounts, or identified only in mixed cell clusters, which may affect the results and bias the cell type specificity.


Relevant links and publications

Uhlén M et al., Tissue-based map of the human proteome. Science (2015)
PubMed: 25613900 DOI: 10.1126/science.1260419

Fagerberg L et al., Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics. Mol Cell Proteomics. (2014)
PubMed: 24309898 DOI: 10.1074/mcp.M113.035600

Guo J et al., The adult human testis transcriptional cell atlas. Cell Res. (2018)
PubMed: 30315278 DOI: 10.1038/s41422-018-0099-2

Liao J et al., Single-cell RNA sequencing of human kidney. Sci Data. (2020)
PubMed: 31896769 DOI: 10.1038/s41597-019-0351-8

MacParland SA et al., Single cell RNA sequencing of human liver reveals distinct intrahepatic macrophage populations. Nat Commun. (2018)
PubMed: 30348985 DOI: 10.1038/s41467-018-06318-7

Man L et al., Comparison of Human Antral Follicles of Xenograft versus Ovarian Origin Reveals Disparate Molecular Signatures. Cell Rep. (2020)
PubMed: 32783948 DOI: 10.1016/j.celrep.2020.108027

Qadir MMF et al., Single-cell resolution analysis of the human pancreatic ductal progenitor cell niche. Proc Natl Acad Sci U S A. (2020)
PubMed: 32354994 DOI: 10.1073/pnas.1918314117