Vitis HLS for Llama2 Acceleration (Part 0)

Plan

  • ✅ Global configuration header file
  • ✅ Divide the Llama2 decoder layer into 2 regions
  • ❌ Analyze the Llama2 decoder layer dataflow
  • ❌ Convert the original software implementation to a Vitis HLS hardware implementation
  • ❌ Compare against Allo-generated HLS code
  • ❌ Measure the speedup
  • ❌ Integrate the actual Llama2 large language model

Global Configuration Header File

#define HIDDEN_SIZE         4096
#define INTERMEDIATE_SIZE   11008
#define NUM_ATTENTION_HEADS 32
#define NUM_KEY_VALUE_HEADS 32

#define NUM_HIDDEN_LAYERS   32
#define VOCAB_SIZE          32000
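
The values above match the published Llama 2 7B configuration. As a minimal sketch of how the header could be used, the block below derives a few quantities from those macros at compile time. The names HEAD_DIM, KV_GROUPS, ATTN_PROJ_ELEMS, and MLP_PROJ_ELEMS are hypothetical and not part of the original header. The base macros are repeated only so the sketch compiles on its own.

```cpp
// Base sizes repeated from the configuration header so this sketch is self-contained.
#define HIDDEN_SIZE         4096
#define INTERMEDIATE_SIZE   11008
#define NUM_ATTENTION_HEADS 32
#define NUM_KEY_VALUE_HEADS 32

// Hypothetical derived constants (assumed names, not from the original header).

// Width of one attention head: the hidden dimension split evenly across heads.
constexpr int HEAD_DIM = HIDDEN_SIZE / NUM_ATTENTION_HEADS;

// Number of query heads that share one key/value head (1 means plain multi-head attention).
constexpr int KV_GROUPS = NUM_ATTENTION_HEADS / NUM_KEY_VALUE_HEADS;

// Element counts of one square attention projection matrix and one MLP projection matrix.
constexpr long ATTN_PROJ_ELEMS = static_cast<long>(HIDDEN_SIZE) * HIDDEN_SIZE;
constexpr long MLP_PROJ_ELEMS  = static_cast<long>(HIDDEN_SIZE) * INTERMEDIATE_SIZE;

// Fail the build early if the configuration is edited into an inconsistent state.
static_assert(HIDDEN_SIZE % NUM_ATTENTION_HEADS == 0,
              "HIDDEN_SIZE must be divisible by NUM_ATTENTION_HEADS");
static_assert(NUM_ATTENTION_HEADS % NUM_KEY_VALUE_HEADS == 0,
              "NUM_ATTENTION_HEADS must be divisible by NUM_KEY_VALUE_HEADS");
```

Keeping derived sizes as compile-time constants means array dimensions and loop bounds in the kernels stay static, which is generally what HLS tools need in order to size on-chip buffers.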

This post is licensed under CC BY 4.0 by the author.