Seedsman Blog

Here’s What We Know About The Cannabis Genome

Given the huge economic potential of the legal cannabis market, it’s unsurprising that researchers are now racing to map the cannabis genome. By identifying the genes responsible for cannabinoid synthesis, pathogen resistance and plant morphology, commercial breeders will be much better equipped to create new cultivars with specific genetic traits. However, unlike many other plant species, cannabis has yet to give up the secrets hidden within its genome, and scientists are still striving to create a comprehensive map of the plant’s DNA.

How The Cannabis Genome Affects Cannabinoids

Dr Gary Yates, Plant Science Research Associate at PharmaSeeds, told Seedsman that “one of the reasons why the cannabis genome project is lagging behind compared to a lot of other crops is that a lot of companies are keeping their info in-house, and the publicly available information is not exactly clear.”

As a consequence, “there’s a bit of a conundrum in understanding what genetic factors influence the enzymes that make certain desirable products, such as THC and CBD.”

Fortunately, though, the answers are beginning to reveal themselves, like pieces of a jigsaw puzzle slowly falling into place. Back in the early 2000s, however, scientists were still essentially staring at a pile of unsorted jigsaw pieces, scrambling to find the corners and edges in order to begin putting the puzzle together. Their first challenge was to locate the genetic mechanisms behind the creation of cannabinoids.

Even at that time, it was already known that tetrahydrocannabinolic acid (THCA) and cannabidiolic acid (CBDA) – the forerunners to THC and CBD – were both produced from their common precursor, cannabigerolic acid (CBGA). What wasn’t known, however, was whether the enzymes that allow for the creation of THCA and CBDA were coded for by one gene with two variants, or by two completely separate genes.

Early attempts to map the cannabis genome seemed to suggest that one gene was responsible for the production of both THC and CBD, although a more thorough study, released in 2011, threw this theory into doubt.[i]

It wasn’t until 2018 that this uncertainty was finally put to bed, thanks to a next-generation genetic sequencing technique called single molecule sequencing. Able to produce long, unfragmented read-outs of DNA, this method was used to produce the first comprehensive map of the Purple Kush genome.

Appearing in the journal Genome Research[ii], the study successfully arranged the plant’s DNA into ten chromosomes. Of particular interest was chromosome six, which was found to contain genes that coded for the enzyme THCA synthase – the molecule responsible for the production of THCA.

Located on the same chromosome, but some 20 million nucleotides down the line, was a second gene that coded for CBDA synthase. In other words, the study proved once and for all that the enzymes responsible for synthesising THC and CBD are coded for by separate genes.

So Why Does Hemp Contain Trace Amounts Of THC?

While this finding represented a massive step forward in our understanding of the cannabis genome, it also posed one massive question. After all, if THCA synthase is coded for by its own discrete gene, and hemp cultivars are bred specifically to lack this gene, then logically they should contain no THC at all. Yet most non-drug varieties do in fact contain trace amounts of this cannabinoid, suggesting that it is still being synthesised at low levels even in the absence of the THCA synthase gene.

The solution to this riddle may lie in a paper that was written in 2020 but has yet to be peer reviewed. By sequencing the genomes of 42 cannabis cultivars, the study authors noted that many plants that have been bred to lack THCA synthase contain a cluster of genes that code for an enzyme called cannabichromenic acid (CBCA) synthase. While the main function of this enzyme is to produce CBCA, the precursor to the non-psychoactive cannabinoid cannabichromene (CBC), the fact that it shares 96 percent of its genetic code with THCA synthase means that it probably also produces small amounts of THCA as a by-product.[iii]
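To give a feel for what a figure like “96 percent” means, here is a minimal Python sketch that counts how many positions match between two sequences that have already been lined up. The function name and the two short sequences are made up for illustration and are not real gene sequences. Real comparisons rely on dedicated alignment tools that also handle insertions and deletions, and this article does not say exactly how the preprint arrived at its number.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned, equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare letter by letter, ignoring case, and count the matches.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Hypothetical 25-letter fragments that differ at a single position.
toy_gene_a = "ATGGCTTACCGATTGCAAGGTCCTA"
toy_gene_b = "ATGGCTTACCGATTGCTAGGTCCTA"

print(f"{percent_identity(toy_gene_a, toy_gene_b):.0f}% identical")  # prints "96% identical"
```

In this toy case, 24 of the 25 letters match, which works out at 96 percent. Across a full-length gene, the same proportion would still leave plenty of individual differences between the two sequences.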

The Cannabis Genome And Pathogen Resistance

Pathogen resistance is a complex issue to get to grips with, as many different genes can play a role in protecting plants from many different pathogens. That said, some progress has been made in determining the genetic factors that provide resistance to a familiar foe for cannabis growers: powdery mildew.

A number of genes that can increase an organism’s susceptibility to the fungus have already been identified across various plant species, and have been labelled mildew resistance locus O, or MLOs. When these genes are deleted, either by random mutations or selective breeding, plants tend to become much better at fighting off the pathogen.

For this reason, geneticists are keen to locate MLOs within the cannabis genome. In the aforementioned preprint study, 24 different MLOs were identified across the 42 cannabis cultivars. As expected, varieties that were known to be particularly resistant to powdery mildew tended to lack these MLOs, giving an insight into the genetic basis for pathogen resistance in cannabis.

Yet the story doesn’t end there, as a number of other genetic factors were found to be involved in protecting cannabis from unwanted invaders. For example, molecules called thaumatin-like proteins (TLPs) are well known to increase resistance to certain pathogens in plants like grapes and hops, and cannabis varieties that lacked the genes that code for TLPs were found to be more susceptible to powdery mildew.

In addition, some of the genes responsible for cannabinoid synthesis were found to play a role in pathogen resistance. For instance, the bundle of genes giving rise to CBCA synthase appears to contain stretches of anti-viral DNA. As a consequence, plants that lack CBC were notable for their lack of resistance to various diseases.

Overall, there are clearly still a number of gaps in our understanding of the cannabis genome, although things are slowly starting to fall into place. In terms of where to look next, Yates explains that “cannabis follows XY sex determination, similar to humans, but no one’s mapped the X and Y chromosomes, which is another big hole in the information circle.”

Given the recent explosion of interest in the cannabis genome, it shouldn’t be too long before the many mysteries it contains are illuminated.

[i] Van Bakel H, Stout JM, Cote AG, Tallon CM, Sharpe AG, Hughes TR, Page JE. The draft genome and transcriptome of Cannabis sativa. Genome Biology. 2011 Oct;12(10):1-8.

[ii] Laverty KU, Stout JM, Sullivan MJ, Shah H, Gill N, Holbrook L, Deikus G, Sebra R, Hughes TR, Page JE, Van Bakel H. A physical and genetic map of Cannabis sativa identifies extensive rearrangements at the THC/CBD acid synthase loci. Genome Research. 2019 Jan 1;29(1):146-56.

[iii] McKernan KJ, Helbert Y, Kane LT, Ebling H, Zhang L, Liu B, Eaton Z, McLaughlin S, Kingan S, Baybayan P, Concepcion G. Sequence and annotation of 42 cannabis genomes reveals extensive copy number variation in cannabinoid synthesis and pathogen resistance genes. bioRxiv. 2020 Jan 1.

Ben Taub