Replies: 5 comments 2 replies
-
Hello, thank you for your interest in Brain4J! Brain4J currently supports the ONNX format for model interoperability (see onnx.proto in brain4j-core). We also have a dedicated brain4j-llm module with architecture adapters (like GPT2Adapter) that handle model loading and inference.

Regarding safetensors + config.json support: this is not natively supported yet, but we recognize its value for Hugging Face ecosystem compatibility. If you have a specific model architecture in mind (GPT-2, LLaMA, etc.), please let us know and we can prioritize that adapter. PRs are also welcome if you'd like to contribute!
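For illustration, here is a rough sketch of what dispatching from a Hugging Face config.json to an architecture adapter could look like. The `ArchitectureAdapter` interface, the regex-based field extraction and the file names are assumptions made for the sketch, not Brain4J's actual API; a real loader would use a proper JSON parser.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AdapterDispatchSketch {

    /** Placeholder for an architecture adapter; Brain4J's real adapter API may differ. */
    interface ArchitectureAdapter {
        void load(Path safetensorsFile, String configJson);
    }

    public static void main(String[] args) throws Exception {
        Path modelDir = Path.of(args[0]);
        String config = Files.readString(modelDir.resolve("config.json"));

        // Pull the "model_type" field out of config.json.
        // A real implementation should use a proper JSON parser instead of a regex.
        Matcher m = Pattern.compile("\"model_type\"\\s*:\\s*\"([^\"]+)\"").matcher(config);
        if (!m.find()) {
            throw new IllegalArgumentException("config.json has no model_type field");
        }
        String modelType = m.group(1);

        // Dispatch to the matching adapter (only a couple of illustrative cases here).
        ArchitectureAdapter adapter = switch (modelType) {
            case "gpt2" -> (file, cfg) -> System.out.println("would build a GPT-2 graph here");
            case "llama" -> (file, cfg) -> System.out.println("would build a LLaMA graph here");
            default -> throw new UnsupportedOperationException("No adapter for " + modelType);
        };

        adapter.load(modelDir.resolve("model.safetensors"), config);
    }
}
```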
-
Thank you very much for your reply. I think Brain4J is a very vibrant deep learning project, especially now that I have discovered it is already experimenting with model adapters for safetensors, which is a very forward-thinking attempt. I also look forward to Java having a place in the future of large models.

By comparison, the TorchScript and ONNX files exported from Python seem to contain only the weights and lack the model structure. I hope we will eventually be able to fine-tune large models on the Java side, so I am looking forward to the combination of safetensors and config.json enabling a retrainable model structure in Java. Transformers does support ONNX export, but we have run into problems: some architectures in Transformers are complex and contain dynamic layers, which makes the ONNX export fail. For example, https://huggingface.co/Aratako/T5Gemma-TTS-2b-2b is very difficult to deal with; I tried several times but could not export it. If you are willing to try, you can give it a shot.

I think the most crucial thing is replicating, on the Java side, the logic in the Transformers library that restores the model structure from config.json. Brain4J has already implemented some common layers, but PyTorch has well over a hundred different layer types, so I think Brain4J should support more layers in the future. It would also be very helpful if the brain4j math module could read and write common Python data formats such as NumPy, pickle, HDF5, and LMDB.
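To make the "restore the structure from config.json" idea concrete, here is a minimal sketch for GPT-2, whose config.json does expose `n_layer`, `n_head`, `n_embd`, `n_positions` and `vocab_size`. The `Layer` record and the layer names are placeholders for the sketch, not existing Brain4J layers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: turn GPT-2 hyperparameters (as found in its config.json) into a layer stack. */
public class Gpt2StructureSketch {

    // Placeholder layer description; Brain4J's real layer classes would go here.
    record Layer(String kind, int... dims) {}

    static List<Layer> buildGpt2(int vocabSize, int nPositions, int nEmbd, int nHead, int nLayer) {
        List<Layer> layers = new ArrayList<>();
        layers.add(new Layer("token_embedding", vocabSize, nEmbd));
        layers.add(new Layer("position_embedding", nPositions, nEmbd));
        for (int i = 0; i < nLayer; i++) {
            layers.add(new Layer("layer_norm", nEmbd));
            layers.add(new Layer("multi_head_attention", nEmbd, nHead));
            layers.add(new Layer("layer_norm", nEmbd));
            layers.add(new Layer("mlp", nEmbd, 4 * nEmbd)); // GPT-2 uses a 4x feed-forward expansion
        }
        layers.add(new Layer("final_layer_norm", nEmbd));
        layers.add(new Layer("lm_head", nEmbd, vocabSize));
        return layers;
    }

    public static void main(String[] args) {
        // Values from the gpt2 (small) config.json:
        // vocab_size=50257, n_positions=1024, n_embd=768, n_head=12, n_layer=12
        List<Layer> layers = buildGpt2(50257, 1024, 768, 12, 12);
        layers.forEach(l -> System.out.println(l.kind() + " " + Arrays.toString(l.dims())));
    }
}
```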
-
At the moment, modern LLMs are not supported, mainly due to the lack of a RoPE implementation, which is a hard requirement for most recent transformer architectures. In parallel, we are focusing on fixing a few critical issues in the GPU backend; GPU stability and correctness are currently a higher priority than adding new model architectures. Once these foundations are solid, brain4j-llm may expand to support multiple LLM architectures and potentially fine-tuning, but this is not on the immediate roadmap.

Regarding Python model/data formats: they are tightly coupled to the Python ecosystem and often rely on implementation details that are not portable across languages. Because of this, direct support is not planned as of now.
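For reference, the RoPE requirement mentioned above is small in terms of math. Below is a minimal, framework-independent sketch in plain Java using the interleaved-pair convention from the original RoFormer paper; note that some checkpoints (e.g. LLaMA / GPT-NeoX style) pair dimension i with i + d/2 instead, so an adapter has to match the convention of the model it loads.

```java
import java.util.Arrays;

/** Minimal sketch of rotary position embeddings (RoPE), independent of any Brain4J API. */
public class RopeSketch {

    /** Rotates a single query/key vector of even length in place, for a given token position. */
    static void applyRope(float[] x, int position, double base) {
        int d = x.length; // head dimension, must be even
        for (int i = 0; i < d / 2; i++) {
            double theta = position * Math.pow(base, -2.0 * i / d);
            double cos = Math.cos(theta);
            double sin = Math.sin(theta);
            float even = x[2 * i];
            float odd = x[2 * i + 1];
            x[2 * i]     = (float) (even * cos - odd * sin);
            x[2 * i + 1] = (float) (even * sin + odd * cos);
        }
    }

    public static void main(String[] args) {
        float[] q = {1f, 0f, 1f, 0f};
        applyRope(q, 3, 10000.0); // rotate as if this vector sits at token position 3
        System.out.println(Arrays.toString(q));
    }
}
```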
-
Brain4j is very impressive and has already implemented many features. It would be a great experience if we could write CUDA kernel functions in Java in the future. There are actually already Java, Scala 3, and Kotlin libraries for reading and writing NumPy, pickle, and HDF5 files that we could find and reference. I am quite optimistic about Brain4j, but for future enterprise-level deployment many more modules will be needed: support for NCCL or Gloo for distributed training, mixed precision, AOT compilation, more tensor operators, and so on. The engineering effort for these is enormous, and becoming a modern deep learning framework in a field with such high barriers to entry will take a great deal of work.
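As a data point on the format question, the .npy container at least is simple enough to read without any dependency. The following is a sketch limited to .npy version 1.0, based on the format described in NumPy's own documentation; it only prints the header and does not materialize the array.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/**
 * Sketch of reading a NumPy .npy (version 1.0) header in plain Java:
 * a magic string, a version, a little-endian header length and a
 * Python-dict-literal header describing dtype, memory order and shape,
 * followed by the raw array bytes.
 */
public class NpyHeaderSketch {

    public static void main(String[] args) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
            byte[] magic = new byte[6];
            in.readFully(magic);
            if (magic[0] != (byte) 0x93 || !"NUMPY".equals(new String(magic, 1, 5, StandardCharsets.US_ASCII))) {
                throw new IOException("Not a .npy file");
            }
            int major = in.readUnsignedByte();
            int minor = in.readUnsignedByte();
            if (major != 1) {
                // Version 2.0+ uses a 4-byte header length; omitted to keep the sketch short.
                throw new IOException("Only .npy version 1.0 handled here, got " + major + "." + minor);
            }
            // The header length is an unsigned 16-bit little-endian integer.
            int headerLen = in.readUnsignedByte() | (in.readUnsignedByte() << 8);
            byte[] header = new byte[headerLen];
            in.readFully(header);
            // The header is a Python dict literal, e.g. {'descr': '<f4', 'fortran_order': False, 'shape': (3, 4), }
            System.out.println(new String(header, StandardCharsets.ISO_8859_1).trim());
            // The raw array bytes follow immediately after the header.
        }
    }
}
```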
-
Could you be more specific about how you would want to write CUDA kernels directly from Java? In my spare time I'm experimenting with a possible future backend that targets CUDA on NVIDIA, Metal on macOS, and OpenCL elsewhere. The idea is to rely on a common kernel language, such as Slang, to avoid rewriting the same kernels three times.

You are also right about interoperability: there are a few JVM solutions for NumPy and pickle (and possibly HDF5 as well). The main reason we haven't adopted them so far is the deliberate choice to keep the core JAR small, portable and fully controllable. That said, this doesn't exclude optional or modular integrations in the future.

Regarding enterprise-level features, I agree that these are essential for a modern framework, but the engineering effort is enormous. Brain4J is currently developed mostly by me and a friend of mine, so progress is much slower than in common ML frameworks, which are often developed and maintained by full-time developer teams.
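To give the CUDA-from-Java question something concrete to react to, here is one possible (entirely hypothetical) shape for a backend-agnostic kernel API in Java. None of these interfaces exist in Brain4J today, and the Slang-based compilation step is only assumed; this is a shape proposal, not an implementation.

```java
import java.util.Map;

/**
 * Hypothetical sketch of a backend-agnostic kernel interface: kernels would be written once
 * in a shared source language (e.g. Slang) and compiled to CUDA, Metal or OpenCL by the
 * selected backend.
 */
public class KernelBackendSketch {

    /** A device buffer handle; the backing memory lives on whatever device the backend chose. */
    interface DeviceBuffer extends AutoCloseable {
        long sizeInBytes();
    }

    /** A compiled kernel that can be launched with a global work size and named arguments. */
    interface CompiledKernel {
        void launch(long[] globalWorkSize, Map<String, DeviceBuffer> args);
    }

    /** One implementation per target: CudaBackend, MetalBackend, OpenClBackend... */
    interface ComputeBackend {
        DeviceBuffer allocate(long bytes);
        CompiledKernel compile(String kernelSource, String entryPoint);
    }

    // Usage would look roughly like this (pseudo-usage, no real backend wired in):
    //
    //   ComputeBackend backend = selectBestAvailableBackend();
    //   CompiledKernel add = backend.compile(slangSourceFor("elementwise_add"), "main");
    //   add.launch(new long[]{n}, Map.of("a", bufA, "b", bufB, "out", bufOut));
}
```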
-
I think a crucial next step is to download the model in safetensors format together with its config.json, then implement the interpretation of config.json to restore the model's architecture and reassign the weights to their corresponding layers. These are the critical steps.
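For the safetensors half of that plan, the container format is documented and easy to read from Java: an 8-byte little-endian header length, a UTF-8 JSON header with dtypes, shapes and byte offsets, then the raw tensor data. A minimal sketch (header only, with no weight reassignment) might look like this:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/**
 * Sketch of reading a safetensors file header in plain Java. Turning the header into
 * actual tensors and matching tensor names to layers is the part an adapter would
 * still have to implement.
 */
public class SafetensorsHeaderSketch {

    public static void main(String[] args) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(args[0], "r")) {
            byte[] lenBytes = new byte[8];
            file.readFully(lenBytes);
            long headerLen = ByteBuffer.wrap(lenBytes).order(ByteOrder.LITTLE_ENDIAN).getLong();

            byte[] headerBytes = new byte[(int) headerLen];
            file.readFully(headerBytes);
            String headerJson = new String(headerBytes, StandardCharsets.UTF_8);

            // The JSON maps each tensor name to {"dtype": "F32", "shape": [...], "data_offsets": [begin, end]},
            // with offsets relative to the first byte after the header. A real loader would parse this
            // with a JSON library and mmap the data region instead of printing it.
            System.out.println(headerJson);
            System.out.println("tensor data starts at byte offset " + (8 + headerLen));
        }
    }
}
```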