Keraflow
Deep Learning for Python.
A Layer is a one-to-one or many-to-one tensor transformer.
Both Theano and TensorFlow are graph-based deep learning frameworks. Let X, W, and b be three symbolic tensors. Then Y = WX + b makes Y a new symbolic tensor that equals W dot X plus b. When there are many tensors in your model, things get messy. The purpose of Layer is to simplify the process of building new tensors from existing tensors. For example:
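The snippet below is a minimal sketch; the import path and the Dense signature (output dimension, init, activation) follow Keras-1-style conventions and are assumptions rather than verbatim Keraflow API:

```python
from keraflow.layers import Input, Dense

# An Input layer acts as a symbolic tensor.
X = Input(100)

# A dense layer; here it takes three arguments.
dense = Dense(64, init='uniform', activation='tanh')

# Feeding the tensor X to the layer yields a new symbolic tensor Y.
Y = dense(X)
```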
In Keraflow, an Input layer is a special type of layer that can be treated as a symbolic tensor. By feeding a tensor X to a layer, we get a new symbolic tensor Y, which can then be fed to another layer.
Each layer takes different arguments to initialize. As seen above, a dense layer takes three arguments. However, there are some common keyword arguments you can pass to every layer, which are defined in Layer's __init__:
- name: for easy debugging.
- trainable: set to False if you don't want the layer parameters to be updated during the training process.
- initial_weights: directly assign the initial values of the layer parameters (this will override the init argument). See Initializations.
- regularizers: apply regularization penalties on the layer parameters. See Regularizers.
- constraints: restrict the values of the layer parameters when updating them during the training process. See Constraints.

For details on setting initial_weights, regularizers, and constraints, please see Argument Passing Summarization and the related pages.
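For instance, a sketch of these keyword arguments in use (the alias strings 'l2' and 'maxnorm' are assumptions; see the pages above for the real spellings):

```python
import numpy as np
from keraflow.layers import Dense

# Hypothetical initial values for the layer's W and b parameters.
W0 = np.random.rand(100, 64)
b0 = np.random.rand(64)

dense = Dense(64,
              name='hidden1',             # for easy debugging
              trainable=True,             # set False to freeze W and b during training
              initial_weights=[W0, b0],   # overrides the init argument
              regularizers=['l2'],        # alias string is an assumption
              constraints=['maxnorm'])    # alias string is an assumption
```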
A layer can be fed multiple times with different input tensors. Consider the following case:
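A sketch, reusing the Input and Dense layers introduced above:

```python
X1 = Input(100)
X2 = Input(100)
dense = Dense(64)

Y1 = dense(X1)  # first feed
Y2 = dense(X2)  # second feed: the same W and b are reused
```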
By feeding the same layer twice, we keep only one W and one b in our tensor graph.
A layer might take more than one tensor as input. For example:
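As a sketch, suppose Keraflow has a Concatenate layer (its name and existence here are assumptions; any multi-input layer is fed the same way):

```python
X1 = Input(100)
X2 = Input(100)

# A multi-input layer is fed a list of tensors rather than a single tensor.
Y = Concatenate()([X1, X2])
```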
The only difference for a multi-input layer is that it takes a list of input tensors instead of a single tensor. Users can define their own layers to take either a single input tensor or multiple input tensors. See Implementing Layer Functions.
We have already seen how to obtain a new tensor by feeding a tensor to a layer. However, it would be cumbersome to name all the tensors when we just want to perform a series of operations. Sequential thus provides syntactic sugar for us:
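For example, a sketch assuming Sequential accepts a list of layers:

```python
from keraflow.models import Sequential

# Equivalent to feeding an input tensor through two Dense layers,
# without naming any of the intermediate tensors.
model = Sequential([Input(100),
                    Dense(64, activation='tanh'),
                    Dense(10, activation='softmax')])
```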
Note that when the first layer of a Sequential is an Input layer, the Sequential is treated as a tensor (the tensor output by its last layer), i.e. you can feed it to another layer but you cannot feed another tensor to it.
When the first layer of a Sequential is not an Input layer, it is treated as a normal layer; whether it takes a single tensor or a list of tensors depends on its first layer.
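A sketch contrasting the two cases:

```python
# First layer is an Input: the Sequential acts as a tensor,
# so we can feed it to another layer ...
seq = Sequential([Input(100), Dense(64)])
Z = Dense(10)(seq)

# ... while a Sequential without an Input acts as a normal layer,
# so we feed a tensor to it instead.
encoder = Sequential([Dense(64), Dense(32)])
Y = encoder(Input(100))
```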
We've covered what users need to know to use existing layers in Keraflow. For advanced information, such as writing customized layers, please refer to the Developer Guide.
A typical deep learning model does the following:

1. Computes the output tensor Y from the input tensor X such that Y = f(X), where f is the model with some trainable parameters (e.g. Y = WX + b).
2. Measures how far Y is from the gold answer Target according to a loss function: loss = L(Y, Target).
3. Computes the gradients of loss with respect to the trainable parameters: Gw, Gb = grad(loss, W), grad(loss, b).
4. Updates the parameters to minimize loss according to some optimizing rule (e.g. subtracting the gradients from the parameters: W, b = W - Gw, b - Gb).

We now introduce Model and Sequential to cover these steps.
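Before that, for intuition, here is what the four steps look like in bare numpy for a linear model (a toy sketch, not Keraflow code):

```python
import numpy as np

X = np.random.rand(32, 100)      # input
Target = np.random.rand(32, 1)   # gold answer
W, b = np.random.rand(100, 1), np.zeros(1)

lr = 0.01
for _ in range(100):
    Y = X.dot(W) + b                        # 1. Y = f(X)
    loss = ((Y - Target) ** 2).mean()       # 2. loss = L(Y, Target)
    Gw = 2 * X.T.dot(Y - Target) / len(X)   # 3. gradients w.r.t. W and b
    Gb = 2 * (Y - Target).mean(axis=0)
    W, b = W - lr * Gw, b - lr * Gb         # 4. update by subtracting the gradients
```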
Let's start with a simple model with one input tensor:
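A sketch (the Model constructor's keyword names are assumptions):

```python
from keraflow.layers import Input, Dense
from keraflow.models import Model

X = Input(100)
Y = Dense(10)(X)

model = Model(inputs=X, outputs=Y)
```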
We tell the model that the input tensor is X and the output tensor is Y. In the one-input-one-output case, we could simply use Sequential to avoid naming all the tensors.
Note that a Sequential can only be used as a model (able to call fit, predict) when its first layer is an Input layer. Moreover, if there are multiple input tensors or multiple output tensors, we can only use Model:
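For example, a sketch of a two-input, two-output model:

```python
X1 = Input(100, name='input1')
X2 = Input(100, name='input2')

shared = Dense(64)                          # shared between both branches
Y1 = Dense(10, name='output1')(shared(X1))
Y2 = Dense(10, name='output2')(shared(X2))

model = Model(inputs=[X1, X2], outputs=[Y1, Y2])
```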
Now we can specify the loss function and the optimizer. For a single-output model (including Sequential):
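A sketch, assuming a Keras-style compile method:

```python
model.compile(optimizer='sgd', loss='categorical_crossentropy')
```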
For a model with multiple outputs, we need to specify a loss function for each output channel:
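For the two-output model above, a sketch:

```python
model.compile(optimizer='sgd',
              loss={'output1': 'categorical_crossentropy',
                    'output2': 'mean_squared_error'})
```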
Note that the name of each output channel is the name of the corresponding output layer (output1, output2 in the example). If you feel that writing these names is unnecessary, you can also pass a list to loss (see Objectives).
In the examples, we pass strings (e.g. sgd, categorical_crossentropy). These strings are actually aliases of predefined optimizers and loss functions. You can also pass customized optimizer/loss function instances (see Optimizers, Objectives).
Both Model and Sequential have three core methods:

- fit: fits X to Target by iteratively updating the model parameters.
- evaluate: computes loss given X and Target.
- predict: computes Y given X.

Both fit and evaluate take X and Target as inputs. As for predict, only X is required.
For Sequential (single input, single output), X and Target each take a single numpy array (or list). For Model, X and Target each take a dictionary/list of numpy array(s) (see Argument Passing Summarization).
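A sketch of both cases (the fit signature beyond X and Target is an assumption):

```python
import numpy as np

x = np.random.rand(32, 100)
t10 = np.random.rand(32, 10)

# For a Sequential (or any single-input, single-output model):
# model.fit(x, t10)

# For the two-input, two-output Model above, key by layer name:
model.fit({'input1': x, 'input2': x},
          {'output1': t10, 'output2': t10})
```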
You can save a (trained) model to disk and restore it for later use. Simply run:
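The file names and the weight-file format below are assumptions; only save_to_file and its arch_fname/weight_fname arguments are documented here:

```python
model.save_to_file(arch_fname='model_arch.json',
                   weight_fname='model_weights.hkl',
                   indent=2)   # extra kwargs such as indent are forwarded to json.dump
```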
Note that the architecture and the model parameters are stored separately. If you don't want to save the model weights, set weight_fname to None.
Keraflow supports two file formats for storing the model architecture: json and yaml. Specify the file extension to switch between the two formats. Note that in addition to arch_fname and weight_fname, save_to_file takes **kwargs and passes them to json.dump or yaml.dump. So, in the above example, the indent argument beautifies the architecture output.
To bring both flexibility and convenience to argument passing in Keraflow, some arguments of some functions accept a single value, a dictionary, or a list. We summarize them as follows. For more examples, please refer to Initial Weights, Regularizers, and Constraints.
Pass a single value. Related arguments:

- initial_weights
- x, y, sample_weights

Pass a dictionary. Related arguments:

- regularizers, constraints
- loss, metrics

Pass a list. Related arguments:

- initial_weights, regularizers, constraints
- sample_weights
- metrics
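A few sketches of these styles (the alias strings and the parameter names 'W'/'b' are assumptions):

```python
# Dictionary: layer arguments are keyed by parameter name,
# compile arguments by output channel name.
dense = Dense(64, regularizers={'W': 'l2'})
model.compile(optimizer='sgd', loss={'output1': 'mse'})

# List: ordered by the layer's parameters (W first, then b)
# or by the model's output channels.
dense = Dense(64, regularizers=['l2', 'l1'])
```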