SARS-CoV-2: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a strain of coronavirus that causes COVID-19, the respiratory illness responsible for the COVID-19 pandemic. The virus previously had the provisional name 2019 novel coronavirus (2019-nCoV), and has also been called human coronavirus 2019 (HCoV-19 or hCoV-19). The virus was first identified in the city of Wuhan, Hubei, China; the World Health Organization designated the outbreak a public health emergency of international concern from January 30, 2020, to May 5, 2023.
SARS-CoV-2 Omicron variant: Omicron (B.1.1.529) is a variant of SARS-CoV-2 first reported to the World Health Organization (WHO) by the Network for Genomics Surveillance in South Africa on 24 November 2021. It was first detected in Botswana and has spread to become the predominant variant in circulation around the world. Following the original B.1.1.529 variant, several subvariants of Omicron have emerged, including BA.1, BA.2, BA.3, BA.4, and BA.5. Since October 2022, two subvariants of BA.5 called BQ.1 and BQ.1.1 have emerged.
Regression analysis: In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one or more independent variables (often called 'predictors', 'covariates', 'explanatory variables' or 'features'). The most common form of regression analysis is linear regression, in which one finds the line (or a more complex linear combination) that most closely fits the data according to a specific mathematical criterion.
Linear regression: In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as the dependent and independent variables, respectively). The case of one explanatory variable is called simple linear regression; for more than one, the process is called multiple linear regression. This term is distinct from multivariate linear regression, where multiple correlated dependent variables are predicted, rather than a single scalar variable.
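As a rough illustration of the simple linear regression case (one explanatory variable), the sketch below fits a line by ordinary least squares using the closed-form slope and intercept. The function name and the data are made up for illustration, and least squares is assumed as the fitting criterion.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    x_bar, y_bar = mean(x), mean(y)
    # Sample covariance and variance, up to a common factor that cancels.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_simple_linear_regression(x, y)
print(intercept, slope)
```

With noisy data the same formulas apply; the fitted line then no longer passes through every point.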
Logistic regression: In statistics, the logistic model (or logit model) is a statistical model that models the probability of an event taking place by having the log-odds for the event be a linear combination of one or more independent variables. In regression analysis, logistic regression (or logit regression) is the estimation of the parameters of a logistic model (the coefficients in the linear combination).
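The relationship between log-odds and probability in the logistic model can be sketched as follows. This shows only how a given set of coefficients maps to a probability, not how the coefficients are estimated; the function names and the example coefficients are hypothetical.

```python
import math


def logistic_probability(coefficients, intercept, features):
    """Probability of the event when the log-odds are linear in the features."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, features))
    # Inverting log(p / (1 - p)) = log_odds gives the logistic function.
    return 1.0 / (1.0 + math.exp(-log_odds))


def logit(p):
    """Log-odds corresponding to a probability p with 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients: log-odds = -1.0 + 0.5 * x
p = logistic_probability([0.5], -1.0, [2.0])  # log-odds are 0 here
print(p, logit(p))
```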
Long COVID: Long COVID or long-haul COVID is a series of health problems persisting or developing after an initial COVID-19 infection. Symptoms can last weeks, months or years and are often debilitating. The World Health Organization defines long COVID as starting three months after infection, but other definitions put the start of long COVID at four weeks. Long COVID is characterised by a large number of symptoms, which sometimes disappear and reappear. Commonly reported symptoms of long COVID are fatigue, memory problems, shortness of breath, and sleep disorders.
Robust regression: In robust statistics, robust regression seeks to overcome some limitations of traditional regression analysis. A regression analysis models the relationship between one or more independent variables and a dependent variable. Standard types of regression, such as ordinary least squares, have favourable properties if their underlying assumptions are true, but can give misleading results otherwise (i.e. are not robust to assumption violations).
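To make the sensitivity of ordinary least squares concrete, the sketch below compares the least-squares slope with the Theil-Sen slope (the median of all pairwise slopes) on made-up data containing one gross outlier. Theil-Sen is used here only as one example of a robust estimator; the entry above does not single out any particular method.

```python
from itertools import combinations
from statistics import mean, median


def ols_slope(x, y):
    """Least-squares slope for a single predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def theil_sen_slope(x, y):
    """Median of the slopes through every pair of points with distinct x."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    return median(slopes)


# Hypothetical data: y = x, except that the last point is a gross outlier.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.0, 1.0, 2.0, 3.0, 40.0]
print(ols_slope(x, y), theil_sen_slope(x, y))
```

On this data the least-squares slope is pulled far from 1 by the single outlier, while the median of pairwise slopes stays at 1.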
Pathogen transmission: In medicine, public health, and biology, transmission is the passing of a pathogen causing communicable disease from an infected host individual or group to a particular individual or group, regardless of whether the other individual was previously infected. The term strictly refers to the transmission of microorganisms directly from one individual to another by one or more of the following means: airborne transmission – very small dry and wet particles that stay in the air for long periods of time allowing airborne contamination even after the departure of the host.
Multinomial logistic regression: In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes. That is, it is a model that is used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables (which may be real-valued, binary-valued, categorical-valued, etc.).
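One common way to turn per-class linear scores into outcome probabilities is the softmax function. The sketch below assumes that parameterisation (one weight vector and intercept per class); the weights, intercepts, and function names are hypothetical, and no fitting is performed.

```python
import math


def softmax(scores):
    """Map real-valued scores to probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probabilities(weights, intercepts, features):
    """Probability of each class given one linear score per class."""
    scores = [
        b0 + sum(w * x for w, x in zip(ws, features))
        for ws, b0 in zip(weights, intercepts)
    ]
    return softmax(scores)


# Hypothetical three-class model with two features.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
intercepts = [0.0, 0.0, 0.0]
probs = class_probabilities(weights, intercepts, [2.0, 0.5])
print(probs)
```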
SARS: Severe acute respiratory syndrome (SARS) is a viral respiratory disease of zoonotic origin caused by the virus SARS-CoV-1, the first identified strain of the SARS-related coronavirus. The first known cases occurred in November 2002, and the syndrome caused the 2002–2004 SARS outbreak. In the 2010s, Chinese scientists traced the virus through the intermediary of Asian palm civets to cave-dwelling horseshoe bats in Xiyang Yi Ethnic Township, Yunnan.
Survival analysis: Survival analysis is a branch of statistics for analyzing the expected duration of time until one event occurs, such as death in biological organisms and failure in mechanical systems. This topic is called reliability theory or reliability analysis in engineering, duration analysis or duration modelling in economics, and event history analysis in sociology.
Errors-in-variables models: In statistics, errors-in-variables models or measurement error models are regression models that account for measurement errors in the independent variables. In contrast, standard regression models assume that those regressors have been measured exactly, or observed without error; as such, those models account only for errors in the dependent variables, or responses. In the case when some regressors have been measured with errors, estimation based on the standard assumption leads to inconsistent estimates, meaning that the parameter estimates do not tend to the true values even in very large samples.
Hazard ratio: In survival analysis, the hazard ratio (HR) is the ratio of the hazard rates corresponding to the conditions characterised by two distinct levels of a treatment variable of interest. For example, in a clinical study of a drug, the treated population may die at twice the rate per unit time of the control population. The hazard ratio would be 2, indicating higher hazard of death from the treatment. A scientific paper might utilise a Hazard Ratio (HR) to state something as follows.
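Returning to the drug-study arithmetic in the hazard ratio entry, the ratio can be worked through under a strong simplifying assumption: each group has a constant hazard, estimated as events divided by person-time at risk. The counts below are made up, and published hazard ratios are usually estimated with a survival model rather than this crude ratio.

```python
def constant_hazard_rate(events, person_time):
    """Events per unit of person-time, assuming a constant hazard."""
    return events / person_time


def hazard_ratio(rate_treated, rate_control):
    """Ratio of the treated group's hazard rate to the control group's."""
    return rate_treated / rate_control


# Hypothetical study: deaths and person-years of follow-up in each group.
treated_rate = constant_hazard_rate(events=20, person_time=100.0)
control_rate = constant_hazard_rate(events=10, person_time=100.0)
hr = hazard_ratio(treated_rate, control_rate)
print(hr)
```

A value above 1 means the treated group experiences the event at a higher rate per unit time than the control group; a value below 1 means a lower rate.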
General linear model: The general linear model or general multivariate regression model is a compact way of simultaneously writing several multiple linear regression models. In that sense it is not a separate statistical linear model. The various multiple linear regression models may be compactly written as $Y = XB + U$, where $Y$ is a matrix with series of multivariate measurements (each column being a set of measurements on one of the dependent variables), $X$ is a matrix of observations on independent variables that might be a design matrix (each column being a set of observations on one of the independent variables), $B$ is a matrix containing parameters that are usually to be estimated and $U$ is a matrix containing errors (noise).
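The matrix roles described in the general linear model entry can be illustrated with small made-up matrices: a response matrix is built as the design matrix times the parameter matrix plus the error matrix, and each column of the result corresponds to one multiple linear regression. The helper functions and all numbers are hypothetical, and the noise is set to zero so the result is easy to check by hand.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def matadd(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[p + q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Design matrix X: an intercept column and one predictor, three observations.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
# Parameter matrix B: one column of coefficients per dependent variable.
B = [[1.0, 0.0], [2.0, -1.0]]
# Error matrix U: zero noise, purely for illustration.
U = [[0.0, 0.0] for _ in range(3)]

Y = matadd(matmul(X, B), U)  # responses: one column per dependent variable
first_response = [row[0] for row in Y]  # the regression y = 1 + 2x
print(Y, first_response)
```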
Risk: In simple terms, risk is the possibility of something bad happening. Risk involves uncertainty about the effects/implications of an activity with respect to something that humans value (such as health, well-being, wealth, property or the environment), often focusing on negative, undesirable consequences. Many different definitions have been proposed. The international standard definition of risk for common understanding in different applications is "effect of uncertainty on objectives".
Coronavirus spike protein: Spike (S) glycoprotein (sometimes also called spike protein, formerly known as E2) is the largest of the four major structural proteins found in coronaviruses. The spike protein assembles into trimers that form large structures, called spikes or peplomers, that project from the surface of the virion. The distinctive appearance of these spikes when visualized using negative stain transmission electron microscopy, "recalling the solar corona", gives the virus family its main name.
Nonparametric regression: Nonparametric regression is a category of regression analysis in which the predictor does not take a predetermined form but is constructed according to information derived from the data. That is, no parametric form is assumed for the relationship between predictors and dependent variable. Nonparametric regression requires larger sample sizes than regression based on parametric models because the data must supply the model structure as well as the model estimates.
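One simple instance of a predictor built directly from the data is the Nadaraya-Watson kernel estimator, which predicts with a locally weighted average of the observed responses. The sketch below assumes a Gaussian kernel and a hand-picked bandwidth; both choices, like the data, are illustrative only.

```python
import math


def gaussian_kernel(u):
    """Unnormalised Gaussian weight; the constant cancels in the ratio below."""
    return math.exp(-0.5 * u * u)


def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Kernel-weighted average of y_train, centred on x_query."""
    weights = [gaussian_kernel((x_query - xi) / bandwidth) for xi in x_train]
    # If every weight underflows to zero (query far from all data relative
    # to the bandwidth), this division fails; a real implementation would
    # need to handle that case.
    return sum(w * yi for w, yi in zip(weights, y_train)) / sum(weights)


# Hypothetical training data with no assumed functional form.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [0.0, 0.8, 0.9, 0.1, -0.8]
estimate = nadaraya_watson(x_train, y_train, x_query=2.0, bandwidth=1.0)
print(estimate)
```

A smaller bandwidth follows the data more closely, while a larger one smooths more heavily.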
Socioeconomic status: Socioeconomic status (SES) is an economic and sociological combined total measure of a person's work experience and of an individual's or family's economic access to resources and social position in relation to others. When analyzing a family's SES, the household income, earners' education, and occupation are examined, as well as combined income, whereas for an individual's SES only their own attributes are assessed. Recent research has identified perceived financial stress as a lesser-recognized attribute of SES, as it defines the "balance between income and necessary expenses".
Risk assessment: Risk assessment determines possible mishaps, their likelihood and consequences, and the tolerances for such events. The results of this process may be expressed in a quantitative or qualitative fashion. Risk assessment is an inherent part of a broader risk management strategy to help reduce any potential risk-related consequences. More precisely, risk assessment identifies and analyses potential (future) events that may negatively impact individuals, assets, and/or the environment (i.e. hazard analysis).