Closed as not planned
Labels: clang:as-a-library (libclang and C++ API), question (A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead!)
Description
Hi everyone,
I'm trying to parse some header files to extract function signatures using clang.cindex in Python. Sometimes it wrongly reports a parameter's type as the parameter's name.
The following declaration from musl is an example:
ssize_t recvfrom (int, void *__restrict, size_t, int, struct sockaddr *__restrict, socklen_t *__restrict);
from: https://git.musl-libc.org/cgit/musl/tree/include/sys/socket.h#n397
And I get size_t as the name of the third argument, with int as its type:
ssize_t recvfrom(int , void *restrict , int size_t, int , struct sockaddr *restrict , socklen_t *restrict )
My code:
import sys
import clang.cindex

header_file = sys.argv[1]
index = clang.cindex.Index.create()
tu = index.parse(header_file)
for cursor in tu.cursor.get_children():
    if cursor.kind == clang.cindex.CursorKind.FUNCTION_DECL:
        print(cursor.result_type.spelling, cursor.spelling, end='(')
        is_first_arg = True
        for arg in cursor.get_arguments():
            # Print a separator before every argument except the first.
            if is_first_arg:
                is_first_arg = False
            else:
                print(', ', end='')
            print(arg.type.spelling, arg.spelling, end='')
        print(')')
My command:
python poc.py musl-1.2.5/include/sys/socket.h
The output:
__uint16_t __bswap_16(__uint16_t __bsx)
__uint32_t __bswap_32(__uint32_t __bsx)
__uint64_t __bswap_64(__uint64_t __bsx)
__uint16_t __uint16_identity(__uint16_t __x)
__uint32_t __uint32_identity(__uint32_t __x)
__uint64_t __uint64_identity(__uint64_t __x)
int select(int __nfds, fd_set *restrict __readfds, fd_set *restrict __writefds, fd_set *restrict __exceptfds, struct timeval *restrict __timeout)
int pselect(int __nfds, fd_set *restrict __readfds, fd_set *restrict __writefds, fd_set *restrict __exceptfds, const struct timespec *restrict __timeout, const __sigset_t *restrict __sigmask)
struct cmsghdr * __cmsg_nxthdr(struct msghdr * __mhdr, struct cmsghdr * __cmsg)
int socket(int , int , int )
int socketpair(int , int , int , int[2] )
int shutdown(int , int )
int bind(int , const struct sockaddr * , socklen_t )
int connect(int , const struct sockaddr * , socklen_t )
int listen(int , int )
int accept(int , struct sockaddr *restrict , socklen_t *restrict )
int accept4(int , struct sockaddr *restrict , socklen_t *restrict , int )
int getsockname(int , struct sockaddr *restrict , socklen_t *restrict )
int getpeername(int , struct sockaddr *restrict , socklen_t *restrict )
ssize_t send(int , const void * , int size_t, int )
ssize_t recv(int , void * , int size_t, int )
ssize_t sendto(int , const void * , int size_t, int , const struct sockaddr * , socklen_t )
ssize_t recvfrom(int , void *restrict , int size_t, int , struct sockaddr *restrict , socklen_t *restrict )
ssize_t sendmsg(int , const struct msghdr * , int )
ssize_t recvmsg(int , struct msghdr * , int )
int getsockopt(int , int , int , void *restrict , socklen_t *restrict )
int setsockopt(int , int , int , const void * , socklen_t )
int sockatmark(int )
The output of clang looks fine when I run this command:
clang -Xclang -ast-dump=json -fsyntax-only include/sys/socket.h
The issue seems to be in clang_getCursorSpelling() in clang/tools/libclang/CIndex.cpp, not in the Python binding.
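One way to narrow this down might be to print the diagnostics that libclang recorded for the translation unit. If the parse reports that size_t is unknown, the odd parameter name could come from parser error recovery rather than from clang_getCursorSpelling(). That explanation is only an assumption; nothing in the output above confirms it. Below is a minimal sketch of such a check. The helper names describe_diagnostics and main are hypothetical, while tu.diagnostics and the severity, spelling and location attributes come from clang.cindex.

```python
def describe_diagnostics(diagnostics):
    """Return one 'file:line: severity N: message' string per diagnostic.

    Hypothetical helper: it only reads the .location, .severity and
    .spelling attributes that clang.cindex.Diagnostic objects expose.
    """
    lines = []
    for diag in diagnostics:
        loc = diag.location
        # loc.file can be None for diagnostics without a source file.
        file_name = loc.file.name if loc.file else "<unknown>"
        lines.append(
            f"{file_name}:{loc.line}: severity {diag.severity}: {diag.spelling}"
        )
    return lines


def main(header_file, extra_args=()):
    # Imported lazily so the helper above works without libclang installed.
    import clang.cindex

    index = clang.cindex.Index.create()
    # extra_args are passed to the parser like compiler command-line flags.
    tu = index.parse(header_file, args=list(extra_args))
    for line in describe_diagnostics(tu.diagnostics):
        print(line)
```

Calling main('musl-1.2.5/include/sys/socket.h') would list whatever the parser complained about. Extra flags such as an include directory can be passed through extra_args, though whether any are needed here is a guess on my part.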