Replies: 17 comments 36 replies
-
I've always been ready to move to yaml 😄, but since this breaks the daily workflow of a lot of people:
-
I think this is an important point. If it is possible without too much effort, we should enable the full range of yaml input first and only then do the deprecation step, so that users don't get annoyed by having to adapt their input files/scripts.
-
I appreciate the recent effort to add the |
-
Since a possible deprecation of the dat file format is a major break in backwards compatibility, I'll share my thoughts on the topic (in no particular order):
-
Personally, I don't like the idea of deprecating the dat file in such a short time, as it is part of every developer's workflow. I wouldn't assume that all developers are ready to make such a change right now, as I don't know their personal schedules and current tasks. I would propose a longer grace period. I haven't tried yaml as input for 4C, so I don't know its benefits for the daily workflow. I think it would be nice if this could be shown to other developers (maybe in the next developer meeting?).
-
There seems to be a lot of confusion regarding different aspects of the input and timeline. As I already mentioned (also pointed out by @eulovi), I propose to do a TGM-like presentation (I guess the presenter should be one of the driving forces) with all the new input features, as most people are not familiar with them. I think this is particularly interesting for people who actually create input files, i.e., users, as this is a first glance at how the changes could improve their daily research. Not only is this information important to decide to switch workflows (for each user individually), but it also paves the way for an informed discussion of the next steps and possible improvements. |
-
Coming from someone who worked with a lot of dat files during the recent clean-up of the input mechanism: The fact that dat is not a standardized format makes such changes very time-consuming. If the files were yaml, I would have simply written a Python script to replace certain keys, values, and so on. dat files may contain extra spaces, comments, and other irregularities, so simple regex searches do not always work.
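A minimal sketch of the kind of script meant here, assuming the input has already been parsed into nested dicts and lists. The section and key names are invented for illustration and are not real 4C parameters. The sample is loaded with the standard-library `json` module to keep the example self-contained; a yaml loader such as PyYAML's `yaml.safe_load` would yield the same kind of structure.

```python
import json


def rename_keys(node, renames):
    """Return a copy of `node` with mapping keys renamed at any nesting depth."""
    if isinstance(node, dict):
        # Rename the key if it is listed, then recurse into the value.
        return {renames.get(key, key): rename_keys(value, renames)
                for key, value in node.items()}
    if isinstance(node, list):
        return [rename_keys(item, renames) for item in node]
    # Scalars (numbers, strings, booleans, None) are returned unchanged.
    return node


# Hypothetical input file content; all names are made up.
text = '{"SECTION A": {"OLD_NAME": 1.0, "KEEP": "x"}, "ITEMS": [{"OLD_NAME": 2}]}'
data = json.loads(text)

updated = rename_keys(data, {"OLD_NAME": "NEW_NAME"})
print(json.dumps(updated, indent=2))
```

Because the rename works on parsed data rather than raw text, spacing and comments in the file cannot cause a missed or false match, which is the weakness of the regex approach on dat files.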
-
Thanks to the recent work from @sebproell and @lauraengelhardt, using 4C with yaml files just became even more user-friendly. Here is a demo for some recent sections for which we are currently enabling validation (not in main yet, WIP):
Note: An in-depth tutorial will come in the TGM/meeting.
schema_demo.mp4
-
We have made good progress in supporting standard input file formats (notably yaml) in the last few weeks. Also, @gilrrei and I will give a small TGM on the input improvements in the coming weeks. With this progress in mind, I would like to propose a concrete plan for the deprecation of the dat format:
Remember that deprecation does not mean that a feature is unusable. We are simply thinking forward and expressing the intent to replace a feature with a superior alternative. I see no point in misleading people into thinking that using .dat is a good idea. Not informing users that they should use .yaml/json for their input seems irresponsible at this point. I am open to discussing the time frame for eventual removal, but deprecation should happen basically now. |
-
Recap
Developer point of view:
User point of view:
Thus, we think it should be deleted in the long run.
Note: dat deprecation and deletion are independent of the changes to the mesh file #60. As there is no timeline for that, we do not want to be tied to this separate (but important!) project. Work on #60 is even possible in parallel to the points we list below.
Together with @lauraengelhardt and @gilrrei, we propose the following roadmap for dat file deprecation and deletion:
Phase 1 - Deprecation (now)
Phase 2 - Adaptation
Note: All old dat files will continue to work until the end of this phase.
Phase 3 - Deletion (tbd)
Note that starting at phase 2 we can tackle some of the planned additions. From our point of view, it is important to do phase 1 now. Phases 2 and 3 do not have a strict timeline. Nevertheless, there is overhead in maintaining code that is bound to be deleted. This is a concrete proposal for action. Feel free to comment on specific points and add anything we might have forgotten.
-
We've created a project with repo FourCIPP (coauthored by chatgpt): The FourCIPP repository will hold a Python parser for 4C input data, designed to facilitate the transition to YAML for all 4C Python projects. This tool serves as the backend for 4C input Python projects, providing a streamlined approach to data handling. We'll add features step by step, but the goal is, of course, to handle everything related to 4C input. Once we have a good working version, we'll provide more information. In the meantime, everyone who's interested in this tool can participate; from ideas to coding, contributions are welcome :)
-
To discuss Phase 2 of the Recap (#263 (comment)) in particular (details, timeline) and to clarify potential misunderstandings, here is an invitation to a Zoom meeting on April 1st, 15:30 - 16:30: https://tum-conf.zoom-x.de/j/69826674614?pwd=T2gYCfeW0RBHH1Avsli4mcLs6JXLOo.1 Please feel free to join. Meeting notes will be provided here afterwards.
-
As far as I can judge from this thread, we have not yet reached an agreement on whether we want to deprecate the dat format. There are still some important points to answer/discuss:
Let me offer some perspective on why I am not as enthusiastic as some others about switching to yaml and deprecating the dat format at the current pace. From everything I have seen so far, yaml in its current state helps with defining parameters in the input file. I have personally never had any relevant issues with that in the dat format. Sure, it is nicer if I automatically get suggestions and see that I wrongly entered the time step as a string, but in the end I still have to actually run the input file with 4C to see if everything is correct, as there are a lot of consistency checks that cannot be realized with the yaml checks. Also, these advantages don't work with the approach taken with CubitPy and MeshPy (I will talk to @davidrudlstorfer, as he is one of the few who know both the yaml input and MeshPy, and listen to his take).
My current point of view is: "yaml has some nice features, but it does not help me with my workflow. The advantages we get from using yaml in its current state don't outweigh the drawback of updating all our tools and scripts. There are other more pressing needs." I surveyed a few 4C users at IMCS (mainly MeshPy and CubitPy users) and the feedback was more or less the same.
I think such a large strategic change does not depend solely on the technical facts but on many other aspects as well, and should at least be discussed in the maintainer meeting. I am curious about the answers and hope to get a constructive discussion going.
Also, I think we have to improve our communication in discussions like these and remember that we are a community and should consider the wishes and needs of everyone in that community. I understand that the people behind the proposed changes are very enthusiastic and want to make progress, but I would also ask for some thoughtfulness toward the people who only marginally benefit from the proposed changes but need to adapt their whole input workflow, which has been developed over a long time.
Changes to such fundamental parts of 4C are bound to bring up different opinions and as we have seen in the past, a (virtual) discussion in person usually also helps to better understand conflicting opinions. Thus, I want to thank @georghammerl for setting up the meeting and I am looking forward to discussing future solutions for our input and the possible deprecation of the dat format there. Finally, regarding |
-
To determine which points are open for discussion in the meeting, we (@gilrrei, @sebproell) have summarized the key points we believe we all agree on. If anyone disagrees with any of the points listed below, please provide your comments and reasoning.
If you disagree with any of the points mentioned above, please provide your reasons below. |
-
After discussion with @isteinbrecher and @mayrmt, I would like to suggest a modified roadmap:
Phase 0 - Assess the technical capabilities of yaml
Phase 1 - yaml is the future, dat will be phased out; Both yaml & dat are fully operational (no forced conversion to yaml)
Note: Steps 2...6 can be worked on in parallel.
Phase 2 - Deletion of dat support (tbd)
I am curious whether this roadmap can serve as a compromise that achieves the same outcome as suggested in #263 (comment). So for now, the agenda for our meeting on April 1st is to discuss the content/order and a possible time frame of this proposed roadmap. Thoughts on how to efficiently update a Python-based workflow are welcome and may be exchanged.
-
Here are the timings for
Thanks, @mayrmt, for providing the input file. After spending some time applying all input changes that happened within one year to the input file, I got the input file running with the latest 4C version (commit sha:
Comparison dat to yaml
There is no noticeable difference in file size because the dat file had some extra whitespace due to formatting, e.g. some double spaces instead of one.
Reading time
The runtime overhead for yaml is between 9 and 11 %. Note that the simulation crashes some time after reading the input with the error
-
Dat is bound to be deprecated. I created a milestone to track the progress: https://github.com/4C-multiphysics/4C/milestone/4 I will transfer the roadmap into a large issue for this milestone (see #598). |
-
I want to use this thread to discuss if, how and when to deprecate the dat format. Depending on the outcome, we can derive the concrete tasks in an issue.
Everybody uses input files, so tagging all @4C-multiphysics/developers and @gilrrei for QUEENS. Please voice your ideas or concerns. Here are ideas that came up in discussion with a few of you.
Why should we deprecate the dat format?
Inlined dat-style strings
When we switch to yaml, we might want to keep the option to "inline" a yaml object into a dat-style string. What I mean is the following:
This yaml object
might be entered equivalently as an inlined dat string
The motivation here is mainly to save vertical space (and a bit of file size).
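As a rough illustration of the equivalence described above, the sketch below expands a one-line string into the mapping that the multi-line yaml object would represent. It does not reproduce the real 4C inline syntax: the alternating `KEY value` layout, the helper name `parse_inline`, and all key names are assumptions made up for this example.

```python
def parse_inline(line):
    """Turn 'KEY value KEY value ...' into a dict (hypothetical inline layout)."""
    tokens = line.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected alternating KEY value tokens")

    def convert(token):
        # Try int first, then float; fall back to the raw string.
        for cast in (int, float):
            try:
                return cast(token)
            except ValueError:
                pass
        return token

    # Even positions hold keys, odd positions hold their values.
    return {key: convert(value) for key, value in zip(tokens[0::2], tokens[1::2])}


# Invented one-line entry and the object form it stands for.
inline = "ID 1 STIFFNESS 10.0 LABEL steel"
as_object = parse_inline(inline)
print(as_object)
```

The object form of this made-up entry would take three lines in a yaml file (one per key), while the inlined form takes one, which is the vertical-space saving mentioned above.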
When to switch
From an implementation side, nothing is really missing. Technically, we could do it right now because we can read yaml, although we only support the "inlined dat" style mentioned above. Thus, I would consider supporting the object yaml style first so we can convert 4C input files to yaml files with a useful structure right away. Second, we might want to have the schema file describing these objects ready before we fully switch so it is a pleasurable experience for everybody to work with yaml :)
We might need to consider external scripts and grant a (short) transition period.
Related
Related to #113, which is the driving force and obvious benefit of yaml over dat in terms of usability.
This discussion is tangentially related to #60, although it does not really matter if we read our mesh from dat or yaml, since both are equally unsuited for this kind of information.