TransformerEncoder

public struct TransformerEncoder<Element, Device> : LayerType, Codable where Element : RandomizableType, Device : DeviceType

Transformer encoder that applies token embedding, positional encoding, and a stack of transformer encoder layers, as introduced in "Attention Is All You Need".

  • The stack of transformer encoder layers applied after token embedding and positional encoding.

    Declaration

    Swift

    public var encoderLayers: [TransformerEncoderBlock<Element, Device>]
  • Declaration

    Swift

    public var parameters: [Tensor<Element, Device>] { get }
  • Declaration

    Swift

    public var parameterPaths: [WritableKeyPath<`Self`, Tensor<Element, Device>>] { get }
  • Creates a transformer encoder that applies token embedding, positional encoding, and a stack of transformer encoder layers, as introduced in "Attention Is All You Need".

    Declaration

    Swift

    public init(vocabSize: Int, layerCount: Int, heads: Int, keyDim: Int, valueDim: Int, modelDim: Int, forwardDim: Int, dropout: Float)

    Parameters

    vocabSize

    Number of distinct tokens that can occur in the input

    layerCount

    Number of transformer encoder layers

    heads

    Number of attention heads in each encoder layer

    keyDim

    Size of keys in multi-head attention layers

    valueDim

    Size of values in multi-head attention layers

    modelDim

    Size of embedding vectors as well as hidden layer activations and outputs

    forwardDim

    Size of hidden layer activations within pointwise feed forward layers

    dropout

    Rate of dropout applied within pointwise feed forward and multi-head attention layers

  • Forwards the given batch of token sequences through the encoder.

    Declaration

    Swift

    public func callAsFunction(_ inputs: (input: Tensor<Element, Device>, sequenceLengths: [Int])) -> Tensor<Element, Device>

    Parameters

    inputs

    Batch of padded token sequences together with the actual (unpadded) length of each sequence

    Return Value

    Batch of encoder outputs with shape [batchSize, maxLen, modelDim]
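
As a usage sketch: the snippet below constructs an encoder and forwards a padded batch through it. It assumes the `Float`/`CPU` generic arguments satisfy `RandomizableType` and `DeviceType`, includes `vocabSize` per the parameter list above, and uses illustrative hyperparameters, token ids, and tensor-construction calls; it is not a definitive reference for the library's API.

```swift
import DL4S

// Hyperparameters are illustrative; modelDim and forwardDim follow the
// base configuration from "Attention Is All You Need".
let encoder = TransformerEncoder<Float, CPU>(
    vocabSize: 10_000, // distinct tokens in the input vocabulary
    layerCount: 6,     // stacked encoder layers
    heads: 8,          // attention heads per layer
    keyDim: 64,        // per-head key size
    valueDim: 64,      // per-head value size
    modelDim: 512,     // embedding and output size
    forwardDim: 2048,  // pointwise feed-forward hidden size
    dropout: 0.1
)

// Two token sequences padded to maxLen = 4; ids and the zero padding id
// are illustrative.
let tokenIds: [Float] = [
    5, 17, 42, 9,  // sequence of length 4
    7, 3, 0, 0     // sequence of length 2, zero-padded
]
let input = Tensor<Float, CPU>(tokenIds).view(as: [2, 4])

// Per the signature above, the result has shape
// [batchSize, maxLen, modelDim] = [2, 4, 512].
let encoded = encoder((input: input, sequenceLengths: [4, 2]))
```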