Deep learning
Deep learning is part of a broader family of machine learning methods based on artificial neural networks with representation learning. The adjective "deep" refers to the use of multiple layers in the network. The methods used can be supervised, semi-supervised, or unsupervised.
Tumour heterogeneity
Tumour heterogeneity describes the observation that different tumour cells can show distinct morphological and phenotypic profiles, including cellular morphology, gene expression, metabolism, motility, proliferation, and metastatic potential. This phenomenon occurs both between tumours (inter-tumour heterogeneity) and within tumours (intra-tumour heterogeneity). A minimal level of intra-tumour heterogeneity is a simple consequence of the imperfection of DNA replication: whenever a cell (normal or cancerous) divides, a few mutations are acquired, leading to a diverse population of cancer cells.
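The replication argument above can be sketched as a toy simulation. This is only an illustration, not a biological model: the function name `simulate_divisions`, the fixed count of two new mutations per division, and the use of random 64-bit integers as mutation identifiers are all assumptions made for the example.

```python
import random

def simulate_divisions(generations, mutations_per_division=2, seed=0):
    """Grow a clone from one founder cell.

    Every daughter cell inherits its parent's mutations and acquires a few
    new ones, so genotypes diverge with each round of division.
    """
    rng = random.Random(seed)
    population = [frozenset()]  # the founder cell carries no mutations
    for _generation in range(generations):
        next_population = []
        for cell in population:
            for _daughter in range(2):  # each division yields two daughters
                # Hypothetical mutation IDs: random 64-bit integers.
                new_mutations = {rng.getrandbits(64)
                                 for _ in range(mutations_per_division)}
                next_population.append(cell | new_mutations)
        population = next_population
    return population

cells = simulate_divisions(generations=4)
distinct_genotypes = len(set(cells))
```

Under these assumptions every cell ends up with its own genotype, which is the minimal intra-tumour heterogeneity the entry describes; no selection or cell death is modelled.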
Somatic evolution in cancer
Somatic evolution is the accumulation of mutations and epimutations in somatic cells (the cells of a body, as opposed to germ plasm and stem cells) during a lifetime, and the effects of those mutations and epimutations on the fitness of those cells. This evolutionary process was first shown by Bert Vogelstein's studies of colon cancer. Somatic evolution is important in the process of aging as well as in the development of some diseases, including cancer. Cells in pre-malignant and malignant neoplasms (tumors) evolve by natural selection.
Germline mutation
A germline mutation, or germinal mutation, is any detectable variation within germ cells (cells that, when fully developed, become sperm and ova). Mutations in these cells are the only mutations that can be passed on to offspring, which happens when a mutated sperm or oocyte takes part in forming a zygote. After fertilization, the zygote divides rapidly to produce all of the cells in the body, so the mutation is present in every somatic and germline cell of the offspring; such a mutation is also known as a constitutional mutation.
Hereditary nonpolyposis colorectal cancer
Hereditary nonpolyposis colorectal cancer (HNPCC), or Lynch syndrome, is an autosomal dominant genetic condition associated with a high risk of colon cancer as well as other cancers, including endometrial cancer (the second most common) and cancers of the ovary, stomach, small intestine, hepatobiliary tract, upper urinary tract, brain, and skin. The increased risk for these cancers is due to inherited genetic mutations that impair DNA mismatch repair. It is a type of cancer syndrome.
Feature learning
In machine learning, feature learning or representation learning is a set of techniques that allows a system to automatically discover the representations needed for feature detection or classification from raw data. This replaces manual feature engineering and allows a machine to both learn the features and use them to perform a specific task. Feature learning is motivated by the fact that machine learning tasks such as classification often require input that is mathematically and computationally convenient to process.
Pancreatic neuroendocrine tumor
Pancreatic neuroendocrine tumors (PanNETs, PETs, or PNETs), often referred to as "islet cell tumors" or "pancreatic endocrine tumors", are neuroendocrine neoplasms that arise from cells of the endocrine (hormonal) and nervous systems within the pancreas. PanNETs are a type of neuroendocrine tumor, representing about one-third of gastroenteropancreatic neuroendocrine tumors (GEP-NETs). Many PanNETs are benign, while some are malignant. Aggressive PanNETs have traditionally been termed "islet cell carcinoma".
Renal cell carcinoma
Renal cell carcinoma (RCC) is a kidney cancer that originates in the lining of the proximal convoluted tubule, a part of the very small tubes in the kidney that transport primary urine. RCC is the most common type of kidney cancer in adults, responsible for approximately 90–95% of cases. RCC shows a male predominance, with a male-to-female ratio of 1.5:1, and most commonly occurs between the sixth and seventh decades of life. Initial treatment is most commonly either partial or complete removal of the affected kidney(s).
Feedforward neural network
A feedforward neural network (FNN) is one of the two broad types of artificial neural network, characterized by the direction of the flow of information between its layers. Its flow is uni-directional: information moves in only one direction, forward, from the input nodes, through the hidden nodes (if any), to the output nodes, without any cycles or loops. This contrasts with recurrent neural networks, which have a bi-directional flow.
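The one-way flow from input nodes through hidden nodes to output nodes can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the sigmoid activation, the layer sizes, and all weight values are assumptions chosen for the example.

```python
import math

def sigmoid(x):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per node, then sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Pass activations strictly forward, layer by layer, with no loops back."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations

# Hypothetical network: 2 inputs -> 3 hidden nodes -> 1 output node.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]
output = forward([1.0, 2.0], layers)
```

Each layer reads only the activations of the layer before it, so no value computed later can influence an earlier node.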
Mutation
In biology, a mutation is an alteration in the nucleic acid sequence of the genome of an organism, virus, or extrachromosomal DNA. Viral genomes contain either DNA or RNA. Mutations result from errors during DNA or viral replication, mitosis, or meiosis or other types of damage to DNA (such as pyrimidine dimers caused by exposure to ultraviolet radiation), which then may undergo error-prone repair (especially microhomology-mediated end joining), cause an error during other forms of repair, or cause an error during replication (translesion synthesis).
Machine learning
Machine learning (ML) is an umbrella term for solving problems for which the development of algorithms by human programmers would be cost-prohibitive; instead, the problems are solved by helping machines 'discover' their 'own' algorithms, without being explicitly told what to do by any human-developed algorithm. Recently, generative artificial neural networks have surpassed the results of many previous approaches.
Convolutional neural network
A convolutional neural network (CNN) is a regularized type of feed-forward neural network that learns features by itself via filter (or kernel) optimization. Vanishing gradients and exploding gradients, seen during backpropagation in earlier neural networks, are prevented by using regularized weights over fewer connections. For example, for each neuron in a fully-connected layer, 10,000 weights would be required to process an image of 100 × 100 pixels.
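The weight count in the example above is plain arithmetic, sketched here alongside a convolutional filter for comparison. The 5 × 5 kernel size and the single-channel (grayscale) image are assumptions for illustration; the entry itself gives only the fully-connected figure.

```python
def dense_weights_per_neuron(height, width, channels=1):
    """A fully connected neuron needs one weight per input value."""
    return height * width * channels

def conv_weights_per_filter(kernel_height, kernel_width, channels=1):
    """A convolutional filter reuses one small set of weights across the image."""
    return kernel_height * kernel_width * channels

dense_count = dense_weights_per_neuron(100, 100)  # the 100 x 100 image from the text
conv_count = conv_weights_per_filter(5, 5)        # assumed 5 x 5 kernel
```

Under these assumptions the fully-connected neuron needs 10,000 weights while the shared filter needs 25, which is what "fewer connections" amounts to in practice.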
Recurrent neural network
A recurrent neural network (RNN) is one of the two broad types of artificial neural network, characterized by the direction of the flow of information between its layers. In contrast to the uni-directional feedforward neural network, it is a bi-directional artificial neural network, meaning that it allows the output from some nodes to affect subsequent input to the same nodes. The ability of RNNs to use internal state (memory) to process arbitrary sequences of inputs makes them applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition.
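The idea that a node's output feeds back into its own later input can be sketched with a single recurrent unit. This is a deliberately tiny illustration: the tanh activation, the scalar weights `w_in` and `w_rec`, and their values are assumptions, not parameters of any particular model.

```python
import math

def rnn_step(x, h_prev, w_in, w_rec, bias):
    """One recurrent update: the new state depends on the input and the old state."""
    return math.tanh(w_in * x + w_rec * h_prev + bias)

def run_rnn(sequence, w_in=0.5, w_rec=0.9, bias=0.0):
    """Process a sequence of any length, carrying the internal state forward."""
    h = 0.0  # initial memory
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_in, w_rec, bias)
        states.append(h)
    return states

# Only the first input is non-zero; later states still remember it.
states = run_rnn([1.0, 0.0, 0.0])
```

Setting `w_rec` to zero removes the feedback, and the unit then forgets the first input as soon as the next one arrives.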
Types of artificial neural networks
There are many types of artificial neural networks (ANN). Artificial neural networks are computational models inspired by biological neural networks, and are used to approximate functions that are generally unknown. In particular, they are inspired by the behaviour of neurons and the electrical signals they convey between input (such as from the eyes or nerve endings in the hand), processing, and output from the brain (such as reacting to light, touch, or heat). The way neurons semantically communicate is an area of ongoing research.
Oncogenomics
Oncogenomics is a sub-field of genomics that characterizes cancer-associated genes. It focuses on genomic, epigenomic and transcript alterations in cancer. Cancer is a genetic disease caused by the accumulation of DNA mutations and epigenetic alterations, leading to unrestrained cell proliferation and neoplasm formation. The goal of oncogenomics is to identify new oncogenes or tumor suppressor genes that may provide new insights into cancer diagnosis, the prediction of clinical outcomes, and new targets for cancer therapies.
Self-supervised learning
Self-supervised learning (SSL) is a paradigm in machine learning for processing data of lower quality, rather than improving ultimate outcomes. Self-supervised learning more closely imitates the way humans learn to classify objects. The typical SSL method is based on an artificial neural network or another model such as a decision list. The model learns in two steps. First, the task is solved based on an auxiliary or pretext classification task using pseudo-labels, which help to initialize the model parameters.
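The first step, building a pretext classification task from pseudo-labels, can be sketched as follows. The pretext task used here (guessing whether a sequence was reversed) and the function name `make_pretext_dataset` are assumptions chosen for illustration; the sketch only generates the pseudo-labelled data and does not train a model.

```python
import random

def make_pretext_dataset(unlabeled, seed=0):
    """Build (input, pseudo_label) pairs from unlabeled sequences.

    The pseudo-label is 1 if the sequence was reversed and 0 otherwise,
    so labels come from the data itself rather than from human annotation.
    """
    rng = random.Random(seed)
    dataset = []
    for sequence in unlabeled:
        if rng.random() < 0.5:
            dataset.append((list(reversed(sequence)), 1))
        else:
            dataset.append((list(sequence), 0))
    return dataset

unlabeled = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
pretext = make_pretext_dataset(unlabeled)
```

A model trained to predict these pseudo-labels would have its parameters initialized by the pretext task, as the entry describes.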
Artificial neural network
Artificial neural networks (ANNs, also shortened to neural networks (NNs) or neural nets) are a branch of machine learning models that are built using principles of neuronal organization discovered by connectionism in the biological neural networks constituting animal brains. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons.
Microsatellite instability
Microsatellite instability (MSI) is the condition of genetic hypermutability (predisposition to mutation) that results from impaired DNA mismatch repair (MMR). The presence of MSI represents phenotypic evidence that MMR is not functioning normally. MMR corrects errors that spontaneously occur during DNA replication, such as single base mismatches or short insertions and deletions. The proteins involved in MMR correct polymerase errors by forming a complex that binds to the mismatched section of DNA, excises the error, and inserts the correct sequence in its place.
Pancreatic cancer
Pancreatic cancer arises when cells in the pancreas, a glandular organ behind the stomach, begin to multiply out of control and form a mass. These cancerous cells have the ability to invade other parts of the body. A number of types of pancreatic cancer are known. The most common, pancreatic adenocarcinoma, accounts for about 90% of cases, and the term "pancreatic cancer" is sometimes used to refer only to that type. These adenocarcinomas start within the part of the pancreas that makes digestive enzymes.
Whole genome sequencing
Whole genome sequencing (WGS), also known as full genome sequencing, complete genome sequencing, or entire genome sequencing, is the process of determining the entirety, or nearly the entirety, of the DNA sequence of an organism's genome at a single time. This entails sequencing all of an organism's chromosomal DNA as well as DNA contained in the mitochondria and, for plants, in the chloroplast. Whole genome sequencing has largely been used as a research tool, but it began to be introduced into clinics in 2014.