{"metadata":{"image":[],"title":"","description":""},"api":{"url":"","auth":"required","params":[],"results":{"codes":[]}},"next":{"description":"","pages":[]},"title":"Tutorial: First Steps","type":"basic","slug":"tutorial-first-steps","excerpt":"Here you will train an existing model (denoising autoencoder) on the standard MNIST dataset.","body":"To get up to speed on deep learning, check out a blog post here: [Deep Learning 101](http://markus.com/deep-learning-101).\n\nI also recommend setting up [Theano to use the GPU](http://deeplearning.net/software/theano/tutorial/using_gpu.html) to vastly reduce training time.\n\nOpenDeep relies on the creation of three classes:\n* Dataset: you will use a dataset object to act as an interface to whatever data you train with. You can use standard datasets provided, or create your own from files or even in-memory if you pass arrays from other packages like Numpy.\n\n* Model: the model defines the computation you want to perform.\n\n* Optimizer: an optimizer takes a model and a dataset, and trains the model's parameters using examples drawn from the dataset. This is a logically separate object to give flexibility to training models as well as the ability to continually train models based on new data.\n\n#Example code\nLet's say you want to train a Denoising Autoencoder on the standard MNIST handwritten digit dataset. You can get started in just a few lines of code:\n[block:code]\n{\n \"codes\": [\n {\n \"code\": \"# standard libraries\\nimport logging\\n# third-party imports\\nfrom opendeep.log.logger import config_root_logger\\nimport opendeep.data.dataset as datasets\\nfrom opendeep.data.standard_datasets.image.mnist import MNIST\\nfrom opendeep.models.single_layer.autoencoder import DenoisingAutoencoder\\nfrom opendeep.optimization.adadelta import AdaDelta\\n\\n# grab the logger to record our progress\\nlog = logging.getLogger(__name__)\\n# set up the logging to display to std.out and files.\\nconfig_root_logger()\\nlog.info(\\\"Creating a new Denoising Autoencoder\\\")\\n\\n# grab the MNIST dataset\\nmnist = MNIST()\\n\\n# define some model configuration parameters\\nconfig = {\\n \\\"input_size\\\": 28*28, # dimensions of the MNIST images\\n \\\"hidden_size\\\": 1500 # number of hidden units - generally bigger than input size\\n}\\n# create the denoising autoencoder\\ndae = DenoisingAutoencoder(config)\\n\\n# create the optimizer to train the denoising autoencoder\\n# AdaDelta is normally a good generic optimizer\\noptimizer = AdaDelta(dae, mnist)\\n# train the model!\\noptimizer.train()\\n\\n# test the trained model and save some reconstruction images\\nn_examples = 100\\n# grab 100 test examples\\ntest_xs = mnist.getDataByIndices(indices=range(n_examples), subset=datasets.TEST)\\n# test and save the images\\ndae.create_reconstruction_image(test_xs)\",\n \"language\": \"python\"\n }\n ]\n}\n[/block]\nAfter 318 training epochs with AdaDelta, this is what the reconstructions look like on test images:\n[block:image]\n{\n \"images\": [\n {\n \"image\": [\n \"https://files.readme.io/Q6MZk4SeiCI5s9Bg1PQe_reconstruction.png\",\n \"reconstruction.png\",\n \"840\",\n \"280\",\n \"#cecece\",\n \"\"\n ],\n \"caption\": \"Left: original test images.\\nCenter: corrupted (noisy) images.\\nRight: reconstructed images (output).\"\n }\n ]\n}\n[/block]\n#Passing data from Numpy/Scipy/Pandas/Array\nIf you want to use your own data for training/validation/testing, you can pass any array-like object (it gets cast to a numpy array in the code) to a Dataset like 
#Passing data from Numpy/Scipy/Pandas/Array
If you want to use your own data for training/validation/testing, you can pass any array-like object (it gets cast to a numpy array in the code) to a Dataset like so:
```python
# imports
import numpy
from opendeep.data.dataset import MemoryDataset

# create some fake random data to demonstrate creating a MemoryDataset
# train set
fake_train_X = numpy.random.uniform(0, 1, size=(100, 5))
fake_train_Y = numpy.random.binomial(n=1, p=0.5, size=100)
# valid set
fake_valid_X = numpy.random.uniform(0, 1, size=(30, 5))
fake_valid_Y = numpy.random.binomial(n=1, p=0.5, size=30)
# test set (showing you can mix and match input types, as long as they can be cast to numpy arrays)
fake_test_X = [[0.1, 0.2, 0.3, 0.4, 0.5],
               [0.9, 0.8, 0.7, 0.6, 0.5]]
fake_test_Y = [0, 1]

# create the dataset!
# note that everything except train_X is optional; that is the bare minimum for an unsupervised model.
data = MemoryDataset(train_X=fake_train_X, train_Y=fake_train_Y,
                     valid_X=fake_valid_X, valid_Y=fake_valid_Y,
                     test_X=fake_test_X, test_Y=fake_test_Y)
# now you can use the dataset normally when creating an optimizer, like the other tutorials show!
```
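Since `MemoryDataset` casts whatever you hand it to numpy arrays, data that lives in a pandas DataFrame works as well: pull the underlying arrays out with `.values`. Here is a minimal sketch, assuming you have pandas installed; the column names, label rule, and 80/20 split are made up for illustration:
```python
import numpy
import pandas
from opendeep.data.dataset import MemoryDataset

# a hypothetical DataFrame with five feature columns and one label column
df = pandas.DataFrame(numpy.random.uniform(0, 1, size=(100, 6)),
                      columns=["f1", "f2", "f3", "f4", "f5", "label"])
df["label"] = (df["label"] > 0.5).astype(int)

# split the rows 80/20 into train and validation sets
n_train = int(len(df) * 0.8)
train, valid = df[:n_train], df[n_train:]

feature_cols = ["f1", "f2", "f3", "f4", "f5"]
# .values exposes the underlying numpy arrays, which MemoryDataset accepts
data = MemoryDataset(train_X=train[feature_cols].values,
                     train_Y=train["label"].values,
                     valid_X=valid[feature_cols].values,
                     valid_Y=valid["label"].values)
```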
#Summary
Congrats, you just:
* set up a dataset (MNIST or an array from memory)
* instantiated a denoising autoencoder model with some configuration
* trained it with an AdaDelta optimizer
* and predicted some outputs given inputs (and saved them as an image)!

![the_carlton.gif](https://files.readme.io/CPkcSw6IRkdzWekfB5Kl_the_carlton.gif)