Skip to content

Conversation

@Maijin
Copy link
Member

@Maijin Maijin commented Dec 18, 2025

Your checklist for this pull request

  • I've read the guidelines for contributing to this repository.

  • I made sure to follow the project's coding style.

  • I've documented every RZ_API function and struct this PR changes.

  • I've added tests that prove my changes are effective (required for changes to RZ_API).

  • I've updated the Rizin book with the relevant information (if needed).

  • I've used AI tools to generate fully or partially these code changes and I'm sure the changes are not copyrighted by somebody else.

  • Antigravity + Gemini 3 Flash

Detailed description
This PR improves the documentation and provides practical examples for the librz/socket module. The goal is to make the library easier to use for new contributors and developers building tools on top of Rizin.

Key Changes:

  • **Enhanced [librz/socket/README.md]
  • New C Examples (examples/api/socket/)

These have been tested on Linux Debian WSL

@Maijin Maijin changed the title doc(socket): improve README and add examples doc(socket): improve README and add examples #5135 Dec 18, 2025
@codecov
Copy link

codecov bot commented Dec 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 47.22%. Comparing base (797fcd0) to head (54d8efd).

Additional details and impacted files

see 11 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 797fcd0...54d8efd. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Maijin Maijin force-pushed the docs/socket-readme branch from 521fa70 to 54d8efd Compare January 7, 2026 04:45
@notxvilka notxvilka requested a review from wargio January 7, 2026 04:45
int len = rz_socket_read(s, (ut8 *)buf, sizeof(buf) - 1);
if (len > 0) {
buf[len] = 0;
printf("Server: Received: %s\n", buf);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add comments where the input and output would need sanitation.
You don't need to implement it, but at least a comment would be nice.
E.g. here printing just the raw buffer is somewhat meh.

## Key Structures

- `RzSocket`: Main structure for network and serial connections. Supports SSL.
- `RzPipe`: Used for communicating with Rizin processes or linked libraries.
Copy link
Member

@Rot127 Rot127 Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please correct me if I am wrong, but doesn't RzPipe only supports commands as input and returns the output?

The sentence (and the other text) here is a little too much high level I'd say. It is not "actionable" (for the lag of a better term) for the user.
E.g. when should I use what? etc.

if (rz_socket_connect_tcp(s, "localhost", "9090", 0)) {
rz_socket_printf(s, "hello\n");
char buf[1024];
rz_socket_read(s, (ut8 *)buf, sizeof(buf));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least check the returned len here and print a message like received X bytes answer or something.

#include <stdlib.h>
#include <string.h>

#define ADDR "127.0.0.1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's 2026, I think we need to support both IPv4 and IPv6 actually.

@Rot127
Copy link
Member

Rot127 commented Jan 7, 2026

@Maijin I added a few comments.

But I have more fundamental concerns.
I don't feel good at all about the AI generated docs here and in the other PRs.

The are three reasons:
One reason is that this documentation will be used particularly by LLMs to generate answers. And we all know that people will trust this LLM output pretty much blindly.

Secondly, and more importantly it will be used by more advanced users directly. If they discover that there are subtle mistakes in there, they going to be really annoyed.

And lastly, others tried before using LLMs to generate these docs and it didn't end well (I think @moste00 did it a while back with https://deepdocs.dev/ IIRC?). The results were wrong or useless, sometimes in the details, sometimes completely. Even the correct stuff was too superficial to be useful.
But the worst thing was that it never was obviously wrong to a newcomer. So couldn't be detected by the people who actually need it.

Those are the reasons why I want to insist on precision.
I think documentation must be correct at the time of writing, on point and contain only information people need to use Rizin effectively (not more and not less).

Otherwise, the LLM generate output will be even less helpful, more nonsense and contain more subtle errors than LLMs already output (especially for C code).
Subtle mistakes or too much superficial long description of the code base will also confuse new comers and give a bad impression to advanced users.

I do not know how much of the text or code you'd generated. But in the READMEs I had the impression most of it? Apologies if this is not the case.

But I wouldn't like to merge this PR and the one with the Reg profile in the current form.
The others can over vote me of course, that is fine.

But to give concrete improvements ideas:

I think the example code is mostly fine. And can be generated partly with AI.
IF the example is commonly used. There shouldn't be code for the sake of having code.
I think good example code can be copy pasted to start a new project.
(Certain sanitation can be left out for brevity, but this must be highlighted with comments).

This PR's code is ok I think, but for the RzRegProfile I an not sure as I mentioned already.

The README files though should absolutely be re-written by hand IMHO.
They need to be more reduced I think, to the information people need and not more.

For example:

It could contain two sections. One API usage section and one section for developers.

API

The API section can give:

  • an intro about each module and what it was intended to be used for. This should focus on the value it provides to the user. Like "Module X solves that problem for you".
  • Then explain the main structs and what all the struct members are for.
  • Then explain a simple procedure how the API would be used (init server, init client socket, read, close socket).
  • Lastly add a link to the example code.

This should not be written with an LLM. Not even partly.

Dev

  • Describe the dir structure
  • what the architecture of the module is (very hard to do, should be done by the author).

I think the Dev section can be omitted for now. Because it requires too much knowledge and/or research to make it useful.

I am aware this takes way more time and effort.
Especially because it needs research into the code base and self-experimenting with the API with realistic use cases.

My only idea how to get a good start would be to solve simple CTFs with the Rizin API. This could give the hands on experience with the API to enable one to write good docs.
But I'm just spitballing here.

@moste00
Copy link
Contributor

moste00 commented Jan 7, 2026

And lastly, others tried before using LLMs to generate these docs and it didn't end well (I think @moste00 did it a while back with https://deepdocs.dev/ IIRC?).

To elaborate on that and because I'm totally an evangelist for deepwiki.com, I used it back in late August to generate docs for Capstone (one of Rizin's dependencies we vendor and maintain) and an internal tool inside it called AutoSync. Here's the link for the general overview of Capstone https://deepwiki.com/capstone-engine/capstone, and here's the exact question I asked it about disassembler auto generation https://deepwiki.com/search/summarize-how-the-autosync-too_63290c9a-d917-4f95-adea-3adb5bd01023.

Since then I really loved DeepWiki workflow and user ergonomics, no account setup, no sign in, no credit card, just replace "gitHub.com" with "deekwiki.com" and off you go, asking questions to the entire code base at hand. It shines not as a general documentation generator though, but as a targeted question answerer tool, and the code snippets it brings up is a really awesome way of building trust in the answer and to ensure that the model is not simply making stuff up.

For example I used it to know the following:

1- Does the meson build system support file:// URLs like browsers do in its dependency declaration files? There was no meson documentation on this, the maintainers actually refused to add it because it could encourage people to use this "accidental" feature more, but the point is DeepWiki brought up the test case exercising the feature in meson's source code, just like that.

2- Does the standard library of Erlang/OTP look up Windows' PATHEXT environment variable when searching for an executable to run ? (No, it hardcodes a special list of extensions and only looks for them, DeepWiki brought up the file with the executable search logic and the list)

3- etc..., just about any questions you have on a public repo, private repos require an account and a subscription

I largely second @Rot127 opinions about using LLMs for documentation in general, I do think that DeepWiki is an exception because it's always grounded in code, it's a "find in haystack" tool that relates your English query about the code with the code actually implementing the logic you want, even if the code is very large or spread across multiple files.

But yes, for anything "Big Picture", anything about architecture in general without a particular question to ground it, you do have to review LLM output and pass it by the maintainers and original authors, publishing a thing on a repo's README implies endorsement from the project.

@Maijin
Copy link
Member Author

Maijin commented Jan 8, 2026

Thanks both! I hear your feedback and i'll use them to improve on what need to be explicitly reviewed/created as sections and for what purposes. I think that is the main issue for me is that the issue itself is quite vague on what's expected for these README so the more articulated example we have the more we can replicate it over the rest.

"what the architecture of the module is (very hard to do, should be done by the author)" - Ideally a complete view of it would be awesome - but if we can get an accurate high-level for now (especially the API part as you mentionned) that'd be great already - I'm trying to keep these README succinct as well as to avoid huge refactor for every changes as well. But yes we should omit the example from the README as well, this is what we had in the issue so that's why I had added it. I'll refactor to only point to them.

I'll revised the socket and the reg one to incorporate the suggested changes and the structure change granted we have enough content for it. For me that was the second issue I think here for Socket/Reg since they're kind of a niche and most isn't used in the code - For example - the AVR MMIO is just one single item and not use that much. When I was looking at these I was even thinking that registry profiles should all be moved away from the analysis_ARCH files and moved over here instead of being hardcoded in but that's a different discussion.

@Rot127
Copy link
Member

Rot127 commented Jan 8, 2026

do have to review LLM output and pass it by the maintainers and original authors, publishing a thing on a repo's README implies endorsement from the project.

I'd like to emphasize here that we, as maintainers, cannot fact check everything all the time.
AI generated PRs increase the amount of work we have a lot! Because we basically have to do the correction. At some point we spent so much time on reviewing, that we could have just written it ourself.
But this negates the purpose of a contributions. Because the fact checking work and the effort to get it right was supposed to be done by the contributor.

  • but if we can get an accurate high-level for now (especially the API part as you mentionned) that'd be great already

I mean AI probably can help to generate the high level view and then you check if this is actually the reality. But if it generates garbage (not unlikely currently), you spent more time figuring out it is nonsense than by just doing it without AI.

should all be moved away from the analysis_ARCH files and moved over here instead of being hardcoded in but that's a different discussion.

Possibly yes, but it would be low priority because we have so many other construction sides which are more important. Reg profiles work reasonably well so far.

This was referenced Jan 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants