Tutorial: Flexibility and Modularity

Working with configurations and inputs/hiddens/params hooks! Modularity is key to OpenDeep.

This tutorial builds on the denoising autoencoder built in Tutorial: Your First Model (DAE).

Our naïve implementation of the denoising autoencoder works great for personal use on the MNIST dataset, but a model hard-coded for a single dataset isn't very practical. In this tutorial, you will see how to make a model flexible via configurations, and how to use hooks to link models together.

Configurations and defaults

With OpenDeep, it is good practice to be able to pass model hyperparameters to the class in a few ways:

  • a JSON file that will be parsed into a dictionary
  • a YAML file that will be parsed into a dictionary
  • a dictionary-like object directly (which means it has a .get() method)

It would also be good to have a "defaults" configuration that can be overridden via a "config", or via parameters passed directly into the model.

Luckily, there is some code to make this really easy! The __init__ method of the Model class handles parsing the "defaults" and "config" dictionaries, as well as the explicitly passed parameters, into one variable: self.args. You can access any configuration parameter from this variable as well as directly from the self reference. Here is an example modifying our class's __init__ from before to handle hyperparameter configurations:

import theano.tensor as T
from opendeep.models.model import Model
from opendeep.utils.nnet import get_weights_uniform, get_bias
from opendeep.utils.noise import salt_and_pepper
from opendeep.utils.activation import get_activation_function
from opendeep.utils.cost import get_cost_function

# create our class initialization!
class DenoisingAutoencoder(Model):
    """
    A denoising autoencoder will corrupt an input (add noise) and try to reconstruct it.
    """
    # Best practice to define all the default parameters up here
    # Provide comments when giving parameters so people know what they are for!
    _defaults = {
        "input_size": 28*28, # dimensionality of input - works for MNIST
        "hidden_size": 1000, # number of hidden units
        "corruption_level": 0.4, # how much noise to add to the input
        "hidden_activation": 'tanh', # the nonlinearity to apply to hidden units
        "visible_activation": 'sigmoid', # the nonlinearity to apply to visible units
        "cost_function": 'binary_crossentropy' # the cost function to use during training
    }
    def __init__(self, config=None, defaults=_defaults,
                 input_size=None, hidden_size=None, corruption_level=None,
                 hidden_activation=None, visible_activation=None, cost_function=None):
        # Now, initialize the Model class with all of the initialization variables passed to
        # the denoising autoencoder, except for the self reference.
        # The parameters will be combined by overriding each other in this order:
        # defaults < config < explicitly passed parameters
        super(DenoisingAutoencoder, self).__init__(
            **{arg: val for (arg, val) in locals().items() if arg != 'self'}
        )
        # These parameters are now accessible from the 'self.args' dictionary, as well as directly accessing from the 'self' reference!

        # use the helper methods to grab appropriate activation functions from names!
        hidden_activation  = get_activation_function(self.hidden_activation)
        visible_activation = get_activation_function(self.visible_activation)

        # do the same for the cost function
        cost_function  = get_cost_function(self.cost_function)

        # Now, define the symbolic input to the model (Theano)
        # We use a matrix rather than a vector so that minibatch processing can be done in parallel.
        x = T.fmatrix("X")
        self.inputs = [x]

        # Build the model's parameters - a weight matrix and two bias vectors
        W  = get_weights_uniform(shape=(self.input_size, self.hidden_size), name="W")
        b0 = get_bias(shape=self.input_size, name="b0")
        b1 = get_bias(shape=self.hidden_size, name="b1")
        self.params = [W, b0, b1]

        # Perform the computation for a denoising autoencoder!
        # first, add noise (corrupt) the input
        corrupted_input = salt_and_pepper(input=x, corruption_level=self.corruption_level)
        # next, compute the hidden layer given the inputs (the encoding function)
        hiddens = hidden_activation(T.dot(corrupted_input, W) + b1)
        # finally, create the reconstruction from the hidden layer (we tie the weights with W.T)
        reconstruction = visible_activation(T.dot(hiddens, W.T) + b0)
        # the training cost is reconstruction error
        self.train_cost = cost_function(output=reconstruction, target=x)

        # Build the expressions used for prediction.
        # When using real-world data for prediction, we wouldn't corrupt the input first.
        # Therefore, create another version of the hiddens and reconstruction without adding the noise.
        # (We will compile these into a Theano prediction function in the modular version below.)
        hiddens_predict = hidden_activation(T.dot(x, W) + b1)
        recon_predict   = visible_activation(T.dot(hiddens_predict, W.T) + b0)

Not too bad! You just made the model infinitely more customizable!
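
For example, here is a minimal usage sketch showing the override order in action (the values and names below are purely illustrative):

# 'dae_config' could just as well be the path to a JSON or YAML file with the same keys.
dae_config = {"hidden_size": 500, "corruption_level": 0.2}

# hidden_size comes from the config (500), corruption_level is overridden by the
# explicitly passed argument (0.3), and everything else falls back to _defaults.
dae = DenoisingAutoencoder(config=dae_config, corruption_level=0.3)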

Modularity with other models via hooks

Finally, a key concept in OpenDeep is maintaining modularity between models. For this purpose, there are three inputs a model should handle when applicable:

  • inputs_hook: this is a tuple of (size, variable) for an input to your model. It should replace the default input you have set up (in our case, the matrix "X").
  • hiddens_hook: this is also a tuple of (size, variable), this time for the hidden representation of your model. It only makes sense for generative models (like denoising autoencoders), where you can compute an output given the hiddens, so hiddens_hook is optional most of the time. How your model handles taking a hidden representation as its input is up to your discretion.
  • params_hook: a list of variables to replace the default parameters used by your model. This enables models to be 'tied' together by using the same parameters. In our case, we should handle replacing "W", "b0", and "b1".

This is easier to explain with a commented example:

import theano.tensor as T
from opendeep import function
from opendeep.models.model import Model
from opendeep.utils.nnet import get_weights_uniform, get_bias
from opendeep.utils.noise import salt_and_pepper
from opendeep.utils.activation import get_activation_function
from opendeep.utils.cost import get_cost_function

# create our class initialization!
class DenoisingAutoencoder(Model):
    """
    A denoising autoencoder will corrupt an input (add noise) and try to reconstruct it.
    """
    # Best practice to define all the default parameters up here
    # Provide comments when giving parameters so people know what they are for!
    _defaults = {
        "input_size": 28*28,  # dimensionality of input - works for MNIST
        "hidden_size": 1000,  # number of hidden units
        "corruption_level": 0.4,  # how much noise to add to the input
        "hidden_activation": 'tanh',  # the nonlinearity to apply to hidden units
        "visible_activation": 'sigmoid',  # the nonlinearity to apply to visible units
        "cost_function": 'binary_crossentropy'  # the cost function to use during training
    }
    def __init__(self, config=None, defaults=_defaults,
                 inputs_hook=None, hiddens_hook=None, params_hook=None,
                 input_size=None, hidden_size=None, corruption_level=None,
                 hidden_activation=None, visible_activation=None, cost_function=None):
        # Now, initialize the Model class to combine config and defaults!
        # Here, defaults is defined via a dictionary. However, you could also
        # pass a filename to a JSON or YAML file with the same format.
        super(DenoisingAutoencoder, self).__init__(
            **{arg: val for (arg, val) in locals().items() if arg != 'self'}
        )
        # Any parameter from the 'config' will overwrite the 'defaults' dictionary, which will be overwritten if the
        # parameter is passed directly to __init__.
        # These parameters are now accessible from the 'self' variable!

        # Define model hyperparameters
        # deal with the inputs_hook and hiddens_hook for the size parameters!
        # if the hook exists, grab the size from the first element of the tuple.
        if self.inputs_hook is not None:
            assert len(self.inputs_hook) == 2, "Was expecting inputs_hook to be a tuple of (size, variable)."
            self.input_size = self.inputs_hook[0]

        if self.hiddens_hook is not None:
            assert len(self.hiddens_hook) == 2, "Was expecting hiddens_hook to be a tuple of (size, variable)."
            self.hidden_size = self.hiddens_hook[0]

        # use the helper methods to grab appropriate activation functions from names!
        hidden_activation  = get_activation_function(self.hidden_activation)
        visible_activation = get_activation_function(self.visible_activation)

        # do the same for the cost function
        cost_function = get_cost_function(self.cost_function)

        # Now, define the symbolic input to the model (Theano)
        # We use a matrix rather than a vector so that minibatch processing can be done in parallel.
        # Make sure to deal with 'inputs_hook' if it exists!
        if self.inputs_hook is not None:
            # grab the new input variable from the inputs_hook tuple
            x = self.inputs_hook[1]
        else:
            x = T.fmatrix("X")
        self.inputs = [x]

        # Build the model's parameters - a weight matrix and two bias vectors
        # Make sure to deal with 'params_hook' if it exists!
        if self.params_hook:
            # check to see if it contains the three necessary variables
            assert len(self.params_hook) == 3, "Not correct number of params to DAE, needs 3!"
            W, b0, b1 = self.params_hook
        else:
            W  = get_weights_uniform(shape=(self.input_size, self.hidden_size), name="W")
            b0 = get_bias(shape=self.input_size, name="b0")
            b1 = get_bias(shape=self.hidden_size, name="b1")
        self.params = [W, b0, b1]

        # Perform the computation for a denoising autoencoder!
        # first, add noise (corrupt) the input
        corrupted_input = salt_and_pepper(input=x, corruption_level=self.corruption_level)
        # next, compute the hidden layer given the inputs (the encoding function)
        # We don't need to worry about hiddens_hook during training, because we can't
        # compute a cost without having the input!
        # hiddens_hook is more for the predict function and linking methods below.
        hiddens = hidden_activation(T.dot(corrupted_input, W) + b1)
        # finally, create the reconstruction from the hidden layer (we tie the weights with W.T)
        reconstruction = visible_activation(T.dot(hiddens, W.T) + b0)
        # the training cost is reconstruction error
        self.train_cost = cost_function(output=reconstruction, target=x)

        # Compile everything into a Theano function for prediction!
        # When using real-world data in predictions, we wouldn't corrupt the input first.
        # Therefore, create another version of the hiddens and reconstruction without adding the noise.
        # Here is where we would handle hiddens_hook because this is a generative model!
        # For the predict function, it would take in the hiddens instead of the input variable x.
        if self.hiddens_hook is not None:
            self.hiddens = self.hiddens_hook[1]
        else:
            self.hiddens = hidden_activation(T.dot(x, W) + b1)
        # make the reconstruction (generated) from the hiddens
        self.recon_predict = visible_activation(T.dot(self.hiddens, W.T) + b0)
        # now compile the predict function accordingly, depending on whether it takes x or the hiddens as input.
        if self.hiddens_hook is not None:
            self.f_predict = function(inputs=[self.hiddens], outputs=self.recon_predict)
        else:
            self.f_predict = function(inputs=[x], outputs=self.recon_predict)

As you can see, this requires putting in some checks for each hook possibility. However, the modularity this approach offers is worth it, as you will see in the next tutorial. With hooks, you can connect models together in novel ways just like Lego pieces - setting the inputs or hiddens of one model as the outputs of another!
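
For instance, here is a minimal sketch of linking these autoencoders through their hooks (the instances, sizes, and variable names below are hypothetical, and the sharing relies on the class as written above):

import theano.tensor as T

# Build a first autoencoder normally.
dae1 = DenoisingAutoencoder(input_size=28*28, hidden_size=1000)

# Tie a second autoencoder to the first one's parameters with params_hook,
# so both models share the same W, b0, and b1.
dae_tied = DenoisingAutoencoder(params_hook=dae1.params)

# Drive just the generative (decoding) half from an externally supplied hidden
# representation by passing a (size, variable) pair through hiddens_hook.
h = T.fmatrix("external_hiddens")
dae_decoder = DenoisingAutoencoder(hiddens_hook=(1000, h), params_hook=dae1.params)

# inputs_hook works the same way: pass a (size, variable) pair to replace the model's
# default input matrix - the next tutorial uses this to feed one model's output into another.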


[Figure: Models connected via hooks.]

Finally, to make modularity between models easier, you should define one more method to access the hidden units:

def get_hiddens(self):
    return self.hiddens

That's it! Now, putting everything together, our flexible, modular model looks like this:

import theano.tensor as T
from opendeep import function
from opendeep.models.model import Model
from opendeep.utils.nnet import get_weights_uniform, get_bias
from opendeep.utils.noise import salt_and_pepper
from opendeep.utils.activation import get_activation_function
from opendeep.utils.cost import get_cost_function

# create our class initialization!
class DenoisingAutoencoder(Model):
    """
    A denoising autoencoder will corrupt an input (add noise) and try to reconstruct it.
    """
    # Best practice to define all the default parameters up here
    # Provide comments when giving parameters so people know what they are for!
    _defaults = {
        "input_size": 28*28,  # dimensionality of input - works for MNIST
        "hidden_size": 1000,  # number of hidden units
        "corruption_level": 0.4,  # how much noise to add to the input
        "hidden_activation": 'tanh',  # the nonlinearity to apply to hidden units
        "visible_activation": 'sigmoid',  # the nonlinearity to apply to visible units
        "cost_function": 'binary_crossentropy'  # the cost function to use during training
    }
    def __init__(self, config=None, defaults=_defaults,
                 inputs_hook=None, hiddens_hook=None, params_hook=None,
                 input_size=None, hidden_size=None, corruption_level=None,
                 hidden_activation=None, visible_activation=None, cost_function=None):
        # Now, initialize the Model class to combine config and defaults!
        # Here, defaults is defined via a dictionary. However, you could also
        # pass a filename to a JSON or YAML file with the same format.
        super(DenoisingAutoencoder, self).__init__(
            **{arg: val for (arg, val) in locals().items() if arg != 'self'}
        )
        # Any parameter from the 'config' will overwrite the 'defaults' dictionary, which will be overwritten if the
        # parameter is passed directly to __init__.
        # These parameters are now accessible from the 'self' variable!

        # Define model hyperparameters
        # deal with the inputs_hook and hiddens_hook for the size parameters!
        # if the hook exists, grab the size from the first element of the tuple.
        if self.inputs_hook is not None:
            assert len(self.inputs_hook) == 2, "Was expecting inputs_hook to be a tuple of (size, variable)."
            self.input_size = self.inputs_hook[0]

        if self.hiddens_hook is not None:
            assert len(self.hiddens_hook) == 2, "Was expecting hiddens_hook to be a tuple of (size, variable)."
            self.hidden_size = self.hiddens_hook[0]

        # use the helper methods to grab appropriate activation functions from names!
        hidden_activation  = get_activation_function(self.hidden_activation)
        visible_activation = get_activation_function(self.visible_activation)

        # do the same for the cost function
        cost_function = get_cost_function(self.cost_function)

        # Now, define the symbolic input to the model (Theano)
        # We use a matrix rather than a vector so that minibatch processing can be done in parallel.
        # Make sure to deal with 'inputs_hook' if it exists!
        if self.inputs_hook is not None:
            # grab the new input variable from the inputs_hook tuple
            x = self.inputs_hook[1]
        else:
            x = T.fmatrix("X")
        self.inputs = [x]

        # Build the model's parameters - a weight matrix and two bias vectors
        # Make sure to deal with 'params_hook' if it exists!
        if self.params_hook:
            # check to see if it contains the three necessary variables
            assert len(self.params_hook) == 3, "Not correct number of params to DAE, needs 3!"
            W, b0, b1 = self.params_hook
        else:
            W  = get_weights_uniform(shape=(self.input_size, self.hidden_size), name="W")
            b0 = get_bias(shape=self.input_size, name="b0")
            b1 = get_bias(shape=self.hidden_size, name="b1")
        self.params = [W, b0, b1]

        # Perform the computation for a denoising autoencoder!
        # first, add noise (corrupt) the input
        corrupted_input = salt_and_pepper(input=x, corruption_level=self.corruption_level)
        # next, compute the hidden layer given the inputs (the encoding function)
        # We don't need to worry about hiddens_hook during training, because we can't
        # compute a cost without having the input!
        # hiddens_hook is more for the predict function and linking methods below.
        hiddens = hidden_activation(T.dot(corrupted_input, W) + b1)
        # finally, create the reconstruction from the hidden layer (we tie the weights with W.T)
        reconstruction = visible_activation(T.dot(hiddens, W.T) + b0)
        # the training cost is reconstruction error
        self.train_cost = cost_function(output=reconstruction, target=x)

        # Compile everything into a Theano function for prediction!
        # When using real-world data in predictions, we wouldn't corrupt the input first.
        # Therefore, create another version of the hiddens and reconstruction without adding the noise.
        # Here is where we would handle hiddens_hook because this is a generative model!
        # For the predict function, it would take in the hiddens instead of the input variable x.
        if self.hiddens_hook is not None:
            self.hiddens = self.hiddens_hook[1]
        else:
            self.hiddens = hidden_activation(T.dot(x, W) + b1)
        # make the reconstruction (generated) from the hiddens
        self.recon_predict = visible_activation(T.dot(self.hiddens, W.T) + b0)
        # now compile the predict function accordingly, depending on whether it takes x or the hiddens as input.
        if self.hiddens_hook is not None:
            self.f_predict = function(inputs=[self.hiddens], outputs=self.recon_predict)
        else:
            self.f_predict = function(inputs=[x], outputs=self.recon_predict)

    def get_inputs(self):
        return self.inputs

    def get_hiddens(self):
        return self.hiddens

    def get_outputs(self):
        return self.recon_predict

    def predict(self, input):
        return self.f_predict(input)

    def get_params(self):
        return self.params

    def get_train_cost(self):
        return self.train_cost

    def save_args(self, args_file="dae_config.pkl"):
        super(DenoisingAutoencoder, self).save_args(args_file)

Whew, you made it! Not too difficult, but you can see how powerful this makes your model. You are now well-equipped to create awesome new research and contribute models!
