Is the pre-training stage trained using the original CodeT5? Or should we add FiD and set the context to 0?