PointwiseFeedForward

public struct PointwiseFeedForward<Element, Device> : LayerType, Codable where Element : RandomizableType, Device : DeviceType

Pointwise feed forward layer as introduced in Attention Is All You Need.

The layer sequences a dense layer, a GeLU activation, another dense layer, and a dropout layer. It additionally has a residual connection, and the output is layer normalized.
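Assuming the properties documented below and a DL4S-style gelu function, the computation of the block can be sketched as:

```swift
// Sketch of the forward pass of PointwiseFeedForward (illustrative,
// not the actual DL4S implementation, which may additionally reshape
// the input before applying the dense layers).
func feedForwardSketch<Element: RandomizableType, Device: DeviceType>(
    inputs: Tensor<Element, Device>,
    dense1: Dense<Element, Device>,
    dense2: Dense<Element, Device>,
    dropout: Dropout<Element, Device>,
    norm: LayerNorm<Element, Device>
) -> Tensor<Element, Device> {
    let hidden = gelu(dense1(inputs))       // first dense layer + GeLU activation
    let projected = dropout(dense2(hidden)) // second dense layer + dropout
    return norm(inputs + projected)         // residual connection + layer normalization
}
```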

  • First dense layer of the block, mapping the input from size to hiddenSize.

    Declaration

    Swift

    public var dense1: Dense<Element, Device>
  • Second dense layer of the block, mapping from hiddenSize back to size.

    Declaration

    Swift

    public var dense2: Dense<Element, Device>
  • Layer normalization applied after the residual connection.

    Declaration

    Swift

    public var norm: LayerNorm<Element, Device>
  • Dropout applied to the output of the second dense layer. Can be enabled and disabled using isDropoutActive.

    Declaration

    Swift

    public var dropout: Dropout<Element, Device>
  • Trainable parameters of the layer.

    Declaration

    Swift

    public var parameters: [Tensor<Element, Device>] { get }
  • Key paths to the trainable parameters of the layer, relative to self.

    Declaration

    Swift

    public var parameterPaths: [WritableKeyPath<`Self`, Tensor<Element, Device>>] { get }
  • Creates a pointwise feed forward layer to be used in a transformer as introduced in Attention Is All You Need. The block sequences a dense layer, a GeLU activation, another dense layer, dropout, a residual connection, and layer normalization.

    Declaration

    Swift

    public init(size: Int, hiddenSize: Int, dropoutRate: Float)

    Parameters

    size

    Size of last dimension of inputs and outputs of the block

    hiddenSize

    Hidden size between dense layers of the block

    dropoutRate

    Rate with which dropout is applied to the output of the second dense layer. Dropout can be enabled and disabled using isDropoutActive.

  • Applies the pointwise feed forward layer to the provided inputs

    Declaration

    Swift

    public func callAsFunction(_ inputs: Tensor<Element, Device>) -> Tensor<Element, Device>

    Parameters

    inputs

    tensor of shape [batch size, sequence length, size]

    Return Value

    tensor of shape [batch size, sequence length, size]
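
As a usage sketch (the hyperparameter values below are illustrative and not taken from this page):

```swift
import DL4S

// Feed forward block with model dimension 512 and hidden dimension 2048,
// the sizes used in Attention Is All You Need (illustrative choice).
let feedForward = PointwiseFeedForward<Float, CPU>(
    size: 512, hiddenSize: 2048, dropoutRate: 0.1
)

// Batch of 16 sequences with 32 tokens of 512 features each.
let inputs = Tensor<Float, CPU>(repeating: 0, shape: [16, 32, 512])
let outputs = feedForward(inputs)
// outputs has the same shape as inputs: [batch size, sequence length, size]
```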