Welcome to NNE#
This website provides a guide to, and the code for, the neural net estimator (NNE) (see paper). NNE uses machine learning techniques to estimate existing econometric models. It is a simulation-based estimator and offers an alternative to simulated maximum likelihood and simulated method of moments, with sizable computational and accuracy gains in suitable applications.
Below, we give an overview of NNE, its step-by-step procedure, and its applicability to marketing/economics problems.
We also provide Matlab code for two applications of NNE: a consumer search model and an AR1 model. The AR1 model is a good example for illustrating the concept of NNE, whereas the consumer search application demonstrates the computational and accuracy advantages of NNE. You can find the code at this GitHub repository. Please also find the code documentation at the consumer search page and the AR1 model page. You're welcome to modify the code to estimate other econometric models.
Overview#
A (structural) econometric model specifies some outcome of interest \(\boldsymbol{y}\equiv\{y_i\}_{i=1}^{n}\) as a function \(\boldsymbol{q}\) of some observed attributes \(\boldsymbol{x}\equiv\{\boldsymbol{x}_i\}_{i=1}^{n}\) and some unobserved attributes \(\boldsymbol{\varepsilon}\). The function \(\boldsymbol{q}\) is often an economic model such as random utility maximization, sequential search, game, etc. The outcome of interest \(\boldsymbol{y}\) can be consumer choice, product sales, etc. The observed attributes \(\boldsymbol{x}\) are often consumer and product characteristics.
We can thus write a structural econometric model as \(\boldsymbol{y} = \boldsymbol{q}(\boldsymbol{x}, \boldsymbol{\varepsilon}, \boldsymbol{\theta})\), where \(\boldsymbol{\theta}\) is the parameter vector of the model. The task of structural estimation is to recover the parameter \(\boldsymbol{\theta}\) from data \(\{\boldsymbol{x, y}\}\).
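As a concrete illustration, take the AR1 model mentioned above, where \(\boldsymbol{q}\) simply iterates \(y_i = \theta y_{i-1} + \varepsilon_i\). Below is a minimal sketch in Python (the official code is in Matlab; the function name `q_ar1` is our own):

```python
import numpy as np

def q_ar1(y0, eps, theta):
    """The structural model q for an AR(1): y_i = theta * y_{i-1} + eps_i.

    y0    -- initial value of the outcome
    eps   -- unobserved shocks epsilon (one per observation)
    theta -- the model parameter to be estimated (AR(1) coefficient)
    """
    y = np.empty(len(eps))
    prev = y0
    for i in range(len(eps)):
        y[i] = theta * prev + eps[i]
        prev = y[i]
    return y

rng = np.random.default_rng(0)
eps = rng.standard_normal(100)
y = q_ar1(0.0, eps, theta=0.6)  # one simulated copy of the data
```

Given a value of \(\theta\) and a draw of the shocks, the model deterministically produces a dataset — which is all NNE needs.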
The basic idea of NNE is to train neural nets to recognize \(\boldsymbol{\theta}\) from data \(\{\boldsymbol{x, y}\}\).
To train such a neural net, we only require being able to simulate \(\boldsymbol{y}\) using the econometric model. Specifically, we first draw many different values of the parameter vector. Given each parameter value, we use the structural model to generate a copy of data. These parameter values and their corresponding copies of data then become the training examples for the neural net. The step-by-step procedure below gives more details.
Step-by-step procedure#
Below we list the usual procedure of NNE. We use \(\ell\) to index the training examples that we use to train the neural net.
1. **Simulate data.** For each \(\ell\), draw a parameter vector \(\boldsymbol{\theta}^{(\ell)}\) from a parameter space \(\Theta\). Given this parameter \(\boldsymbol{\theta}^{(\ell)}\) and the observed attributes \(\boldsymbol{x}\), use the structural econometric model to simulate outcomes \(y_i^{(\ell)}\) for \(i=1,...,n\). Let \(\boldsymbol{y}^{(\ell)}\equiv\{y_i^{(\ell)}\}_{i=1}^{n}\).
2. **Summarize data.** For each \(\ell\), summarize the data \(\{\boldsymbol{y}^{(\ell)}, \boldsymbol{x}\}\) into a set of data moments \(\boldsymbol{m}^{(\ell)}\).
3. **Train a neural net.** Repeat steps 1-2 for \(\ell=1,...,L\) to construct the training examples \(\{\boldsymbol{m}^{(\ell)},\boldsymbol{\theta}^{(\ell)}\}_{\ell=1}^{L}\). We can also repeat steps 1-2 more times to create validation examples. Use these examples to train a neural net.
4. **Get the estimate.** Plug the real data moments into the trained neural net to obtain an estimate of \(\boldsymbol{\theta}\).
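The procedure above can be sketched end-to-end for the AR1 model. The following is an illustrative Python translation (the official implementation is in Matlab), using two autocovariance moments and a small one-hidden-layer net trained by gradient descent on mean-squared-error loss; all names and tuning choices here are our own:

```python
import numpy as np

rng = np.random.default_rng(0)
n, L = 100, 1000          # observations per dataset, training examples

def simulate_ar1(theta, n):
    """Structural model: y_i = theta * y_{i-1} + eps_i, standard normal shocks."""
    y = np.empty(n)
    y[0] = rng.standard_normal()
    for i in range(1, n):
        y[i] = theta * y[i - 1] + rng.standard_normal()
    return y

def moments(y):
    """Data moments m: sample variance and lag-1 autocovariance of y."""
    return np.array([y.var(), np.mean(y[1:] * y[:-1])])

# Steps 1-2: draw theta from the parameter space, simulate a copy of data,
# and summarize each copy into its moments.
Theta = rng.uniform(-0.9, 0.9, L)
M = np.stack([moments(simulate_ar1(t, n)) for t in Theta])

# Standardize the moment inputs (common practice when training neural nets).
mu, sd = M.mean(axis=0), M.std(axis=0)
Mz = (M - mu) / sd

# Step 3: train a one-hidden-layer net on the (m, theta) examples with
# full-batch gradient descent and mean-squared-error loss.
H = 16
W1 = rng.normal(0, 0.5, (2, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.5, H);      b2 = 0.0
lr = 0.02
for _ in range(5000):
    h = np.tanh(Mz @ W1 + b1)
    err = h @ W2 + b2 - Theta              # prediction error on theta
    dh = np.outer(err, W2) * (1 - h**2)    # backprop through tanh
    W2 -= lr * (h.T @ err) / L;  b2 -= lr * err.mean()
    W1 -= lr * (Mz.T @ dh) / L;  b1 -= lr * dh.mean(axis=0)

# Step 4: plug the real data's moments into the trained net.
y_real = simulate_ar1(0.5, n)              # stand-in for the real data
mz = (moments(y_real) - mu) / sd
theta_hat = np.tanh(mz @ W1 + b1) @ W2 + b2
```

The same skeleton applies to richer models: only `simulate_ar1` and `moments` change, while the training and prediction steps stay generic.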
Some practical notes:
- We specify \(\Theta\) so that it likely contains the true \(\boldsymbol{\theta}\). If we have a prior, we may also draw \(\boldsymbol{\theta}^{(\ell)}\) from the prior distribution.
- We specify \(\boldsymbol{m}\) so that it contains information relevant for recovering \(\boldsymbol{\theta}\). Common examples include the mean of \(\boldsymbol{y}\) and the covariances between \(\boldsymbol{y}\) and \(\boldsymbol{x}\). It is generally OK to include possibly irrelevant or redundant moments in \(\boldsymbol{m}\); the performance of NNE is relatively robust to redundant moments.
- We can train NNE with the mean-squared-error loss. Other loss functions can train the neural net to give measures of statistical accuracy in addition to point estimates. See the paper referenced below for details.
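For instance, with the common moment choice mentioned above (mean of \(\boldsymbol{y}\) plus covariances between \(\boldsymbol{y}\) and \(\boldsymbol{x}\)), the moment vector can be computed as follows; this is an illustrative Python sketch and the function name is our own:

```python
import numpy as np

def data_moments(y, X):
    """Summarize data {y, x} into a moment vector m: the mean of y plus the
    covariances between y and each column of x. (An illustrative choice;
    any informative, possibly redundant summary statistics may be used.)"""
    yc = y - y.mean()
    Xc = X - X.mean(axis=0)
    cov_yx = yc @ Xc / (len(y) - 1)        # sample covariances
    return np.concatenate(([y.mean()], cov_yx))

# A hypothetical dataset: 500 observations, 3 observed attributes.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.standard_normal(500)
m = data_moments(y, X)                     # length 1 + 3 = 4
```

The same vector \(\boldsymbol{m}\) is computed for every simulated copy of data and, at the end, for the real data.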
Applicability#
The increasing complexity of models in economics/marketing means there is often no closed-form expression for the likelihood or moment functions. Researchers thus rely on simulation-based estimators such as simulated maximum likelihood (SMLE) or the simulated method of moments (SMM). NNE is a simulation-based estimator as well, but it offers sizable speed and accuracy gains over SMLE/SMM in some applications, making estimation much more tractable. One marketing application that particularly benefits from NNE is consumer sequential search. We study it extensively in the paper referenced below. You can find our code on the consumer search page.
The table below summarizes the main properties of NNE as well as its suitable applications.
| | Main Properties |
|---|---|
| 1 | It does not require computing integrals over the unobservables (\(\boldsymbol{\varepsilon}\)) in the structural econometric model. It only requires being able to simulate data using the econometric model. |
| 2 | It does not require optimizing a (potentially non-smooth) objective function as in extremum estimators (e.g., SMLE, SMM, indirect inference). |
| 3 | It is more robust to redundant moments than SMM/GMM. |
| 4 | It computes a measure of statistical accuracy as a byproduct. |
| Suitable Applications | Less Suitable Applications |
|---|---|
| A large number of simulations are needed to evaluate the likelihood/moments. The SMLE/SMM objective is difficult to optimize. There is no clear guidance on moment choice. Formulas for standard errors are not yet established. | Closed-form expressions are available for the likelihood/moments. The main estimation burden comes from sources other than the simulations to evaluate the likelihood/moments. |
| Examples: discrete choices with rich unobserved heterogeneity, sequential search, choices on networks. | Examples: dynamic choices or games where the main burden is solving policy functions. |
Paper#
“Estimating Parameters of Structural Models with Neural Networks.” 2023.