Hello author, could you please tell me how the training data was constructed? Was it generated directly by the deepseek-r1 model using the prompts in the appendix? Another question: apart from the long-form writing part, how does this paper differ from search-o1? It reads like a specialized extension of search-o1 to the cultural and creative fields. Is that the case?