-
@athre0z @flobernd This is really nice to hear! Don't worry about assistance. I also agree that dependencies are potentially a problem. Thanks for pointing this out again. What were the main complaints from users about the dependencies, if I might ask? Since there is currently neither the time nor enough maintainers to do this anyway, I would let this idea sit here for a while and gather more feedback, or wait for someone who is motivated. We have so many open issues about outdated x86 that I think it would be worth the pain, especially since Zydis is faster and smaller than the x86 module (at least the one time I casually checked). So users would get quite a lot for the annoyance of a dependency. In our case, users who don't need x86 can disable the module completely and not deal with it at all if they don't want to.
-
What benefit is there to have Zydis in Capstone for x86 when the user could just use Zydis? Capstone would also make things slower due to copying the data and most likely allocating memory, so that is a downside. I know it sounds nice to have all sorts of architectures in theory, but in practice all those architectures typically don't have a lot in common and no one really needs all those architectures at the same time.
-
From a user perspective (https://github.com/intel/processwatch), it seems to me that there's not much advantage in Capstone simply calling Zydis for x86 instructions. If I want to call Zydis, I simply would! It's a very easy dependency to build and use; in fact, my project did use it as a submodule for quite some time before I wanted to support ARM, and since Capstone doesn't support later x86 instructions, I'll re-add it back to my repo (but obviously use Capstone for ARM and potentially other architectures). The way I see it, Capstone calling Zydis would:
-
Hi, we use Capstone in Pwndbg. I think we don't mind if Capstone switches to Zydis as long as the Capstone API stays the same. And by that, I also mean all the various detailed information that Capstone provides about instructions (instruction groups, operands, read/write accesses, etc.), as we leverage that information heavily.
-
I agree. After all, Capstone is not going to put a lot of effort into x86; it should be left to the professionals.
-
I'm going to necro this thread because I think it's very important for Capstone to make substantial improvements to the x86 decoder, and those improvements are huge obstacles to overcome using the current LLVM tablegen importer.

I am currently porting the instruction decoder for Dyninst to improve our support for x86 (its internal decoder is already far behind what Capstone can do). Dyninst is used by RHEL's SystemTap utility, so it's installed on a few machines (/sarcasm). We are also a core component of several HPC performance analysis tools (e.g., HPCToolkit), which requires supporting a wide variety of architectures simultaneously. Currently, we support x86, x86_64, ARMv8, ARMv9, ppc64le, and AMDGPU. We are also one of only two teams working on binary instrumentation for AMD GPUs and are planning to upstream our decoder into Capstone. That would make Capstone the only AMD GPU decoder around outside of AMD's SDK. That's a huge win for FOSS since NVIDIA won't open-source their ISA and gtPIN (for Intel XE) isn't the easiest thing to use. Currently, we are also the only group working on binary instrumentation for RISC-V, for which we use Capstone.

What does all of this chest-thumping mean? The Dyninst team doesn't have the ability to fix its x86 decoder and still provide other features. This is exactly Capstone's situation. I'm not waxing philosophical here. Interestingly, right about the time of this thread, Wine announced they are now using Capstone for their decoding. That's huge, and really emphasizes the need for expanding x86 support.

My thoughts on the items discussed in this thread so far:

If you want x86, just use Zydis instead of Capstone.
I strongly considered using xed for x86 and Capstone for everything else in Dyninst. I chose xed because we could also use it to upgrade our x86 binary rewriting capabilities. I, like many others here, don't want to add more dependencies than are necessary, so I really don't want to do this. Are we honestly considering forcing users like Wine to switch to Zydis? I think that's a good way for Capstone to fade into uselessness.

Adding Zydis as a dependency to Capstone is too big a burden on users.
It was mentioned that C and C++ have no standardized ecosystem for package management. Indeed, there is a veritable zoo of package managers centered on C++ (though they support many languages). I'm the maintainer of the Boost spack package, so I know how much work it can be to keep those recipes updated. That said, offering Capstone through package managers like conan and vcpkg drastically increases the surface area for new users and decreases the friction of getting them onboard. I definitely want to do this for Dyninst so users have an even easier time handling our dependencies. I think pasting in Zydis via their amalgamated distribution and then doing separate Capstone tarballs with and without deps is a good balance for the reasons described here. If it works for the Firefox folks, I see no reason it shouldn't work well for Capstone. I think Capstone should also offer recipes for package managers like I discussed above.

Indirection through Zydis adds overhead to Capstone (ref).
I absolutely agree this is something we must be diligent about. Zydis does not dynamically allocate memory (ref), but Capstone's current implementation for x86 performs at least one allocation. Dropping it could be seen as an improvement, so long as it could be measured. Capstone uses a "two-phase" process in which it decodes in the backend and then copies data into the front end (e.g., X86_get_insn_id). I see no difference between this and copying the data from Zydis' data structures, assuming Zydis does not also do a two-phase process.

There is an impedance mismatch between Capstone and Zydis that will make development of Capstone harder (ref).
I want to address each of Ben's points in turn.
Perhaps I've not fully understood the idea, but Capstone's current interface isn't uniform across all architectures, and I'm not sure how it could be made to be so. In fact, there need to be at least a few big changes to Capstone's current x86 interface. For example, there is no representation for implicit memory operands like those in x86 call instructions (I have an upcoming issue and PR about this). There also needs to be an ABI break in
I will withhold comment until such items have been shown to exist.
As I noted above, I agree with many folks here that one dependency is better than N (N >= 2).
I am a massive fan of FOSS and having choices in what software I want to use. In this case, I will push back slightly because Dyninst can be used as an x86 disassembler, but no one should because it is very incomplete. I want to replace it with something better, thus reducing the number of decoders by one. I see the same with Capstone. Users still have large projects like xed, bddisasm, Ghidra, and IDA Pro. The ecosystem for other architectures is microscopic in relation to x86. For example, I'm only aware of four decoders for ppc64le: Dyninst, binutils, llvm-objdump, and Capstone. I intend to replace Dyninst's with Capstone's.
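To make the "two-phase" point above concrete, here is a minimal sketch of a backend decoder whose result is copied into a stable front-end record. It is not Capstone's or Zydis' code: `BackendInsn`, `FrontendInsn`, `backend_decode`, `to_frontend`, and `disasm` are all hypothetical names, and the toy backend only understands `push r64` (0x50-0x57), `nop` (0x90), and `ret` (0xC3).

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical backend record: stands in for whatever the decoder
# behind the front end (LLVM-derived tables, Zydis, ...) fills in.
@dataclass(frozen=True)
class BackendInsn:
    mnemonic: str
    length: int
    operands: Tuple[str, ...]

# Hypothetical front-end record: the stable shape that users see.
@dataclass
class FrontendInsn:
    address: int
    size: int
    mnemonic: str
    op_str: str

# Register order used by the one-byte push r64 opcodes 0x50..0x57.
_PUSH_REGS = ("rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi")

def backend_decode(code: bytes, offset: int) -> BackendInsn:
    """Phase 1: toy decoder for three one-byte x86-64 instructions."""
    opcode = code[offset]
    if 0x50 <= opcode <= 0x57:
        return BackendInsn("push", 1, (_PUSH_REGS[opcode - 0x50],))
    if opcode == 0x90:
        return BackendInsn("nop", 1, ())
    if opcode == 0xC3:
        return BackendInsn("ret", 1, ())
    raise ValueError(f"unsupported opcode 0x{opcode:02x} in this sketch")

def to_frontend(raw: BackendInsn, address: int) -> FrontendInsn:
    """Phase 2: copy backend fields into the front-end record."""
    return FrontendInsn(
        address=address,
        size=raw.length,
        mnemonic=raw.mnemonic,
        op_str=", ".join(raw.operands),
    )

def disasm(code: bytes, base: int) -> List[FrontendInsn]:
    """Decode every instruction in `code`, starting at address `base`."""
    out: List[FrontendInsn] = []
    offset = 0
    while offset < len(code):
        raw = backend_decode(code, offset)
        out.append(to_frontend(raw, base + offset))
        offset += raw.length
    return out

if __name__ == "__main__":
    for insn in disasm(bytes([0x50, 0x90, 0xC3]), 0x1000):
        print(f"0x{insn.address:x}: {insn.mnemonic} {insn.op_str}".rstrip())
```

In this shape, swapping the backend only changes `backend_decode` and the field mapping in `to_frontend`; callers of `disasm` keep seeing the same record. Whether the copy step costs anything measurable in the real libraries is exactly what would need benchmarking.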
-
We will keep the discussion open a little longer, until the Beta is released (hopefully around November). Then we will make an overview table of pros and cons and ask the biggest Capstone projects for their vote.
-
We had the idea to replace the current x86 module with the Zydis disassembler.
The main reason is that we are unlikely to be able to update x86, because it has such a unique place in the LLVM world. So we currently can't use our Auto-Sync updater for it (see: capstone-engine/llvm-capstone#13).
The details are already described in #2503. See below.
Copy of relevant messages
Rot127
athre0z