Getting Started

FeedbackNets is a Julia package built on top of Flux. If you are new to Julia, the official learning resources and the language documentation are good places to start. To get to know Flux, have a look at its website and documentation.

Installation

The package can be installed using Pkg.add():

using Pkg
Pkg.add("FeedbackNets")

or using the REPL shorthand:

] add FeedbackNets

The package depends on Flux. CuArrays is required for GPU support.
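
For example, here is a minimal sketch of moving a model and its input to the GPU (assuming CuArrays is installed and a CUDA-capable device is available; gpu is Flux's device-transfer helper):

using Flux, CuArrays

m = Chain(Dense(10, 2)) |> gpu    # move the model parameters to the GPU
x = gpu(randn(Float32, 10))       # move the input to the GPU
y = m(x)                          # the forward pass now runs on the GPU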

Basic Usage

Once the package is installed, you can load it with Julia's using keyword:

using FeedbackNets

Typically, you'll want to load Flux as well for its network layers:

using Flux

In Flux, you build a (feedforward) deep network by concatenating layers in a Chain. For example, the following code generates a two-layer network that maps 10 input units to 20 hidden units (with a ReLU nonlinearity) and maps these to 2 output units:

net = Chain(
    Dense(10, 20, relu),
    Dense(20, 2)
)

This network can be applied to an input like any function:

x = randn(10)
y = net(x)
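
Flux layers also operate on batches of inputs arranged as the columns of a matrix, so the same network can process several inputs at once (a small sketch; the batch size of 5 is arbitrary):

X = randn(10, 5)    # a batch of 5 inputs, one per column
Y = net(X)          # a 2×5 matrix, one output per column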

To construct a deep network with feedback, you can use a FeedbackChain, which works much like the standard Flux Chain. The difference is that a FeedbackChain knows how to treat two special types of layers: Mergers and Splitters.

Imagine that in the network above, we wanted to provide a feedback signal from the two-unit output layer and change activations in the hidden layer based on it. This requires two steps: first, we need to retain the output of that layer; second, we need to project it back to the hidden layer (e.g., through another Dense layer) and add it to the activations there.

The first part is handled by a Splitter. Essentially, whenever the FeedbackChain encounters a Splitter, it saves the output of the previous layer to a dictionary. This way, it can be reused in the next timestep. The second part is handled by a Merger. This layer looks up the value that the Splitter saved to the dictionary, applies some operation to it (in our case, the Dense layer) and merges the result into the forward pass (in our case, by addition):

net = FeedbackChain(
    Dense(10, 20, relu),
    Merger("split1", Dense(2, 20), +),
    Dense(20, 2),
    Splitter("split1")
)
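
To see what the chain does during a single timestep, here is the same computation written out by hand (a minimal sketch; the standalone layers d1, fb, and d2 stand in for the layers inside the chain above):

using Flux

d1 = Dense(10, 20, relu)    # forward layer
fb = Dense(2, 20)           # feedback projection used by the Merger
d2 = Dense(20, 2)           # output layer

state = Dict("split1" => zeros(2))
x = randn(10)

h = d1(x)                       # first forward layer
h = h .+ fb(state["split1"])    # Merger: project the stored value and add it
y = d2(h)                       # second forward layer
state["split1"] = y             # Splitter: save the output for the next timestep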

Note that the name "split1" is used by both Merger and Splitter. This is how the Merger knows which value from the state dictionary to take. But what happens during the first feedforward pass? The network has not yet encountered the Splitter, so how does the Merger get its value? When a FeedbackChain is applied to an input, it expects to get a dictionary as well, which the user needs to generate for the first timestep. The FeedbackChain returns the updated dictionary as well as the output of the last layer.

state = Dict("split1" => zeros(2))
x = randn(10)
state, out = net(state, x)
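
Over several timesteps, the state returned by one call is simply passed into the next, so feedback recorded at one timestep influences the following one:

state, out1 = net(state, x)    # first timestep
state, out2 = net(state, x)    # second timestep: uses the feedback stored at the first
state, out3 = net(state, x)    # third timestep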

If the user does not want to handle the state manually, they can wrap the net in a Flux Recur, essentially treating the whole network like one recurrent cell:

using Flux: Recur
net = Recur(net, state)
output = net.([x, x, x])
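
Each call to the wrapped network now updates the state internally, so output is a three-element array containing the network's output at each of the three timesteps.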