PointwiseFeedForward
public struct PointwiseFeedForward<Element, Device> : LayerType, Codable where Element : RandomizableType, Device : DeviceType
Pointwise feed forward layer as introduced in Attention Is All You Need.
The layer sequences a dense layer, a GeLU activation, another dense layer, and a dropout layer. A residual connection adds the block's input to the dropout output, and the result is layer normalized.
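The computation can be sketched on plain arrays for a single feature vector. This is a minimal illustration, not the library's implementation: dropout is omitted (it acts as the identity at inference time), and the weight matrices are hypothetical placeholders.

```swift
import Foundation

// Tanh approximation of GeLU, as commonly used in transformer implementations
func gelu(_ x: Double) -> Double {
    0.5 * x * (1 + tanh(sqrt(2 / Double.pi) * (x + 0.044715 * pow(x, 3))))
}

// Affine map: weights is [outputSize][inputSize], bias is [outputSize]
func dense(_ x: [Double], weights: [[Double]], bias: [Double]) -> [Double] {
    weights.indices.map { i in
        zip(weights[i], x).reduce(bias[i]) { acc, pair in acc + pair.0 * pair.1 }
    }
}

// Normalizes to zero mean and unit variance over the feature dimension
func layerNorm(_ x: [Double], eps: Double = 1e-5) -> [Double] {
    let mean = x.reduce(0, +) / Double(x.count)
    let variance = x.map { ($0 - mean) * ($0 - mean) }.reduce(0, +) / Double(x.count)
    return x.map { ($0 - mean) / (variance + eps).squareRoot() }
}

// dense1 -> GeLU -> dense2 -> (dropout omitted) -> residual -> layer norm
func pointwiseFeedForward(_ x: [Double],
                          w1: [[Double]], b1: [Double],
                          w2: [[Double]], b2: [Double]) -> [Double] {
    let hidden = dense(x, weights: w1, bias: b1).map(gelu)
    let out = dense(hidden, weights: w2, bias: b2)
    return layerNorm(zip(x, out).map { pair in pair.0 + pair.1 })
}
```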
-
First dense layer of the block, projecting inputs from size to hiddenSize.
Declaration
Swift
public var dense1: Dense<Element, Device>
-
Second dense layer of the block, projecting from hiddenSize back to size.
Declaration
Swift
public var dense2: Dense<Element, Device>
-
Layer normalization applied to the output of the residual connection.
Declaration
Swift
public var norm: LayerNorm<Element, Device>
-
Dropout applied after the second dense layer; can be enabled and disabled using isDropoutActive.
Declaration
Swift
public var dropout: Dropout<Element, Device>
-
Declaration
Swift
public var parameters: [Tensor<Element, Device>] { get }
-
Declaration
Swift
public var parameterPaths: [WritableKeyPath<`Self`, Tensor<Element, Device>>] { get }
-
Creates a pointwise feed forward layer to be used in a transformer as introduced in Attention Is All You Need. The block sequences a dense layer, a GeLU activation, another dense layer, dropout, a residual connection, and layer normalization.
Declaration
Swift
public init(size: Int, hiddenSize: Int, dropoutRate: Float)
Parameters
size
Size of last dimension of inputs and outputs of the block
hiddenSize
Hidden size between dense layers of the block
dropoutRate
Rate with which dropout is applied after the second dense layer. Can be enabled and disabled using isDropoutActive.
-
Applies the pointwise feed forward layer to the provided inputs.
Declaration
Swift
public func callAsFunction(_ inputs: Tensor<Element, Device>) -> Tensor<Element, Device>
Parameters
inputs
tensor of shape [batch size, sequence length, size]
Return Value
tensor of shape [batch size, sequence length, size]
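A minimal usage sketch. The init signature above is taken from this page; the CPU device type and the Tensor(repeating:shape:) initializer are assumptions about the host library's API, and the hyperparameters follow the base model of Attention Is All You Need.

```swift
import DL4S

// size 512 and hiddenSize 2048 match the base transformer configuration;
// CPU is an assumed device type of the host library.
let ffn = PointwiseFeedForward<Float, CPU>(size: 512, hiddenSize: 2048, dropoutRate: 0.1)

// Input of shape [batch size, sequence length, size]
let input = Tensor<Float, CPU>(repeating: 0, shape: [2, 16, 512])
let output = ffn(input)  // shape: [2, 16, 512]
```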