Constructing theories means introducing concepts which are not directly observable. They should, however, explain empirical findings and thus have to be related to observations. Hence, it is useful and common to distinguish observable (visible) from non-observable (hidden) variables. Furthermore, it is often convenient to separate visible variables into dependent variables, representing results of measurements the theory is aiming to explain, and independent variables which are under explicit control and specify the kind of measurements performed.
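Hence, we will consider the following three groups of variables:
1. observable (visible) independent variables $x$,
2. observable (visible) dependent variables $y$,
3. not directly observable (hidden, latent) variables $h$.
This grouping corresponds to the factorization
$$p(x, y, h) = p(y|x, h)\, p(x)\, p(h). \qquad (1)$$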
The interpretation will be as follows: Variables $h \in H$ represent possible states of (the model of) Nature, being the invisible conditions for the dependent variables $y$. The set $H$ defines the space of all possible states of Nature for the model under study. We assume that states $h$ are not directly observable and that all information about $h$ comes from the observed variables (data) $y$, $x$. A given set of observed data results in a state of knowledge $f$, numerically represented by the posterior density $p(h|f)$ over states of Nature.
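As an illustration of how observed data turn into a posterior over states of Nature, the following minimal sketch applies Bayes' rule to a small finite hypothesis space. Everything concrete here is an assumption made for the example and does not come from the text: the three hypotheses, the uniform prior, the binary $y$, the two-valued $x$, and the names `hypotheses`, `likelihood`, and `posterior`.

```python
import math

# Hypothetical finite hypothesis space H: each h maps a condition x in {0, 1}
# to the probability p(y=1 | x, h) of a binary measurement result y.
hypotheses = [
    {0: 0.2, 1: 0.8},
    {0: 0.5, 1: 0.5},
    {0: 0.8, 1: 0.2},
]
prior = [1 / 3, 1 / 3, 1 / 3]  # assumed uniform prior p(h)


def likelihood(y, x, h):
    """Likelihood p(y | x, h) for binary y under hypothesis h."""
    return h[x] if y == 1 else 1.0 - h[x]


def posterior(data, hypotheses, prior):
    """Posterior over h, proportional to p(h) * prod_i p(y_i | x_i, h).

    The factor p(x) does not depend on h, so it cancels on normalization.
    """
    log_w = [
        math.log(p_h) + sum(math.log(likelihood(y, x, h)) for x, y in data)
        for h, p_h in zip(hypotheses, prior)
    ]
    m = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - m) for lw in log_w]
    z = sum(w)
    return [wi / z for wi in w]


# Observed pairs (x_i, y_i); invented for the example.
data = [(0, 0), (1, 1), (1, 1), (0, 1)]
post = posterior(data, hypotheses, prior)
print(post)
```

With these invented data the first hypothesis receives the largest posterior weight, since it assigns the highest probability to three of the four observations.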
Independent variables $x \in X$ describe the visible conditions (measurement situation, measurement device) under which the dependent variables (measurement results) $y$ have been observed (measured). According to Eq. (1) they are independent of $h$, i.e., $p(x|h) = p(x)$. The conditional density $p(y|x, h)$ of the dependent variables $y$ is also known as the likelihood of $h$ (under $y$ given $x$). A vector-valued $y$ can be treated as a collection of one-dimensional $y$ with the vector index being part of the $x$ variable, i.e., $y_\alpha(x) = y(x, \alpha) = y(\tilde{x})$ with $\tilde{x} = (x, \alpha)$.
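The treatment of vector-valued results can be made concrete with a short sketch: a vector-valued measurement is rewritten as a scalar function of an extended independent variable that carries the component index. The measurement function `y_vec` and the helper `y_scalar` are hypothetical names invented for this illustration.

```python
def y_vec(x):
    """Hypothetical vector-valued measurement with three components."""
    return [x, x ** 2, x ** 3]


def y_scalar(x_tilde):
    """Scalar view: the component index alpha is part of the extended input."""
    x, alpha = x_tilde
    return y_vec(x)[alpha]


# Flatten two vector-valued observations into scalar ones indexed by (x, alpha).
flat = {(x, a): y_scalar((x, a)) for x in (1.0, 2.0) for a in range(3)}
print(flat)
```

Each entry of `flat` is a one-dimensional observation, so a method formulated for scalar dependent variables can be applied to it unchanged.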
In the setting of empirical learning, available knowledge is usually separated into a finite number of training data $D = \{(x_i, y_i) \,|\, 1 \le i \le n\} = \{(x_D, y_D)\}$ and, to make the problem well defined, additional a priori information $D_0$. For data $D \cup D_0$ we write $p(h|f) = p(h|D, D_0)$, where $f$ denotes the resulting state of knowledge. Hypotheses $h$ represent in this setting functions $h(x, y) = p(y|x, h)$ of two (possibly multidimensional) variables $y$, $x$. In density estimation $y$ is a continuous variable (the variable $x$ may be constant and thus be skipped), while in classification problems $y$ takes only discrete values. In regression problems one assumes $p(y|x, h)$ to be Gaussian with fixed variance, so the function of interest becomes the regression function $h(x) = \int dy\, y\, p(y|x, h)$.
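For the regression case, a small numerical check shows that integrating $y$ against a Gaussian likelihood with fixed variance recovers the mean function. This is only a sketch under stated assumptions: the linear mean `mean_fn`, the value of `SIGMA`, and the integration range are invented for the example.

```python
import math

SIGMA = 0.5  # assumed fixed standard deviation of the Gaussian likelihood


def mean_fn(x):
    """Hypothetical mean function of the Gaussian likelihood."""
    return 2.0 * x + 1.0


def p_y_given_x(y, x):
    """Gaussian density p(y | x, h) with mean mean_fn(x) and fixed variance."""
    z = (y - mean_fn(x)) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2.0 * math.pi))


def regression_function(x, lo=-20.0, hi=20.0, n=8001):
    """Approximate the integral of y * p(y | x, h) over y by the trapezoidal rule."""
    dy = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        y = lo + i * dy
        weight = 0.5 if i in (0, n - 1) else 1.0  # half weight at the endpoints
        total += weight * y * p_y_given_x(y, x)
    return total * dy


print(regression_function(1.5), mean_fn(1.5))
```

The two printed values agree up to numerical integration error, which is why under this Gaussian assumption the regression function is the only part of the hypothesis that needs to be estimated.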