DeepSeek releases Prover-V2 model with 671 billion parameters

DeepSeek today released a new model, DeepSeek-Prover-V2-671B, on the AI open-source community Hugging Face. The model is a 671-billion-parameter upgrade of the Prover-V1.5 mathematical model released last year. It is reported to use the more efficient safetensors file format and to support multiple computation precisions, making it faster and less resource-intensive to train and deploy.

Architecturally, the model follows DeepSeek-V3, adopting a Mixture-of-Experts (MoE) design with 61 Transformer layers and a 7,168-dimensional hidden layer. It supports ultra-long contexts, with a maximum position embedding of 163,840, enabling it to handle lengthy and complex mathematical proofs. The weights are also quantized to FP8, reducing model size and improving inference efficiency. (Jinse)
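For readers who want to experiment with the release, the sketch below shows one plausible way to load and query the checkpoint with the Hugging Face transformers library. The repo id deepseek-ai/DeepSeek-Prover-V2-671B and the Lean prompt are assumptions inferred from the announced model name, not an official recipe, and a 671B-parameter MoE model requires substantial multi-GPU hardware to run.

```python
# Illustrative sketch: loading DeepSeek-Prover-V2-671B with Hugging Face
# transformers. The repo id and prompt are assumptions; a model this size
# must be sharded across many GPUs (device_map="auto").
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-Prover-V2-671B"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # upcast from the released FP8 weights
    device_map="auto",           # shard layers across available GPUs
    trust_remote_code=True,      # DeepSeek-V3-style custom architecture
)

# Hypothetical prompt: ask the prover to complete a formal proof.
prompt = (
    "Complete the following Lean 4 theorem:\n"
    "theorem add_comm' (a b : Nat) : a + b = b + a := by"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```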
