Versatile Video Coding (VVC) will be the next-generation video coding standard and is expected to replace HEVC in consumer electronics (CE) devices, such as tablets, smartphones, and TV sets, beyond 2020. The new standard will still be based on transform, quantization, and entropy coding, but a multiple transform selection scheme has been proposed that involves three types of 2-D discrete sine/cosine transforms (DCT-II, DCT-VIII, and DST-VII), with transform unit sizes ranging from 4×4 to 64×64. To handle the computational complexity of these algorithms, it is useful to explore hardware solutions that can be employed as accelerators. This paper proposes a high-performance architecture that implements these 2-D transform types for 4×4, 8×8, 16×16, and 32×32 block sizes. The design has been synthesized for low-, mid-, and high-end FPGA chips and can process up to 23 fps@3840×2160 for 32×32 transforms and up to 86 fps@3840×2160 for pictures containing an even distribution of the four block sizes. These performance results have been obtained with a moderate consumption of hardware resources.
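As an illustration of the operation being accelerated, the following minimal Python sketch computes a separable 2-D transform in which the horizontal and vertical kernels are selected independently from DCT-II, DCT-VIII, and DST-VII. It is not the proposed architecture: it assumes the textbook floating-point, orthonormal basis functions rather than the scaled integer matrices a codec would use, and the function names (`basis`, `forward_2d`, `inverse_2d`) are hypothetical.

```python
import math

def basis(kind, n):
    """Return the n x n orthonormal transform matrix; each row is one basis vector."""
    rows = []
    for i in range(n):
        row = []
        for j in range(n):
            if kind == "DCT-II":
                scale = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
                row.append(scale * math.cos(math.pi * i * (2 * j + 1) / (2 * n)))
            elif kind == "DCT-VIII":
                scale = math.sqrt(4.0 / (2 * n + 1))
                row.append(scale * math.cos(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * n + 2)))
            elif kind == "DST-VII":
                scale = math.sqrt(4.0 / (2 * n + 1))
                row.append(scale * math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1)))
            else:
                raise ValueError("unknown transform type: " + kind)
        rows.append(row)
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def forward_2d(block, hor, ver):
    """Separable forward transform of a square block: Y = Tv * X * Th^T."""
    n = len(block)
    tv, th = basis(ver, n), basis(hor, n)
    return matmul(matmul(tv, block), transpose(th))

def inverse_2d(coeffs, hor, ver):
    """Inverse transform; the matrices are orthonormal, so X = Tv^T * Y * Th."""
    n = len(coeffs)
    tv, th = basis(ver, n), basis(hor, n)
    return matmul(matmul(transpose(tv), coeffs), th)

if __name__ == "__main__":
    # Arbitrary 4x4 test block (made-up values, for illustration only).
    x = [[float((3 * r + 5 * c) % 7) for c in range(4)] for r in range(4)]
    y = forward_2d(x, hor="DST-VII", ver="DCT-VIII")
    x_rec = inverse_2d(y, hor="DST-VII", ver="DCT-VIII")
    err = max(abs(x[r][c] - x_rec[r][c]) for r in range(4) for c in range(4))
    print("max reconstruction error:", err)
```

Because the 2-D transform is separable, it reduces to two passes of 1-D matrix multiplication (one over columns, one over rows), so the cost grows with the cube of the block side length when computed directly, which is part of why the larger block sizes dominate the workload.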