Commit e941d55

Python plugin implementation (Ericsson#781)
1 parent aaf3814 commit e941d55

46 files changed: +4308 -3 lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -11,6 +11,8 @@ nbproject/
 # clangd cache directory
 .cache/
 
+# clangd cache
+.cache/
 
 ## Build folders
 build/

Config.cmake

Lines changed: 3 additions & 0 deletions
@@ -51,6 +51,9 @@ set(INSTALL_GEN_DIR "${INSTALL_SCRIPTS_DIR}/generated")
 # Installation directory for Java libraries
 set(INSTALL_JAVA_LIB_DIR "${INSTALL_LIB_DIR}/java")
 
+# Installation directory for the Python plugin
+set(INSTALL_PYTHON_DIR "${INSTALL_LIB_DIR}/pythonplugin")
+
 # Installation directory for executables
 set(INSTALL_BIN_DIR "bin")
 

doc/pythonplugin.md

Lines changed: 123 additions & 0 deletions
@@ -0,0 +1,123 @@
+# Python Plugin
+
+## Parsing Python projects
+Python projects can be parsed by using the `CodeCompass_parser` executable.
+See its usage [in a separate document](/doc/usage.md).
+
+## Python-specific parser flags
+
+### Python dependencies
+Large Python projects usually have multiple Python package dependencies.
+Although a given project can be parsed without installing any of its dependencies, it is strongly recommended
+that the required modules are installed so that parsing is complete.
+To install a project's dependencies, create a [Python virtual environment](https://docs.python.org/3/library/venv.html)
+and install the necessary packages.
+When parsing a project, specify the virtual environment path so the parser can successfully resolve the dependencies:
+```
+--venvpath <path to virtual environment>
+```
+
+### Type hints
+The parser can try to determine Python type hints for variables, expressions and functions.
+It can work out type hints such as `Iterable[int]` or `Union[int, str]`.
+However, this process can be extremely slow, especially for functions, so it is disabled by default.
+It can be enabled using the `--type-hint` flag.
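For illustration, the hints named in this section are ordinary `typing` annotations. The sketch below is not taken from the plugin: `evens` and `parse_port` are hypothetical functions whose declared return types are the kind of hint the parser would try to work out for unannotated code.

```python
from typing import Iterable, Union, get_type_hints


def evens(limit: int) -> Iterable[int]:
    """Yield the even numbers below `limit` (hypothetical example function)."""
    return (n for n in range(limit) if n % 2 == 0)


def parse_port(raw: str) -> Union[int, str]:
    """Return an int for numeric input, otherwise the original string."""
    return int(raw) if raw.isdigit() else raw


# The standard library can report the declared hints at runtime.
hints = get_type_hints(parse_port)
```

Here the hints are written by hand; the `--type-hint` flag concerns code where the parser has to determine them itself, which is why it can be slow.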
+
+### Python submodules
+Large Python projects can have internal submodules and the parser tries to locate them automatically.
+Specifically, it looks for `__init__.py` files and treats the folders that contain them as modules.
+This process is called submodule discovery and can be disabled using the `--disable-submodule-discovery` flag.
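The idea behind submodule discovery can be sketched in a few lines. This is a minimal illustration under the assumption that discovery amounts to a directory walk looking for `__init__.py`; the function name `discover_package_dirs` is hypothetical, and the plugin's actual implementation may differ.

```python
import os
from typing import List


def discover_package_dirs(root: str) -> List[str]:
    """Return every directory under `root` that contains an __init__.py file."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # A folder counts as a module candidate only if it has __init__.py.
        if "__init__.py" in filenames:
            found.append(dirpath)
    return sorted(found)
```

Folders without an `__init__.py` are skipped by this sketch, which is the case where adding a path manually becomes useful.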
+
+You can also add submodules manually by adding those specific paths to the parser's syspath:
+```
+--syspath <path>
+```
+For more information, see the [Python syspath docs](https://docs.python.org/3/library/sys.html#sys.path).
+
+### File references
+By default, the parser works out references by looking for definitions only - if nodes share the same definition
+they are considered references.
+However, this method sometimes misses a few references (e.g. local variables in a function).
+To extend the search for references to the file context, apply the `--file-refs` flag.
+Note that using this option can increase the total parsing time.
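The default, definition-based matching can be pictured as grouping nodes by the definition they resolve to. The sketch below is an assumption-laden illustration: the `ref_id` field name is borrowed from the `PYName` model in this commit, but the `Node` class, its meaning of `ref_id`, and `group_references` are hypothetical and not the parser's code.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a parsed name; not the plugin's real type."""
    id: int
    ref_id: int  # assumed: id of the definition this node resolves to
    line: int


def group_references(nodes: List[Node]) -> Dict[int, List[Node]]:
    """Group nodes so that all nodes sharing a definition end up together."""
    groups: Dict[int, List[Node]] = defaultdict(list)
    for node in nodes:
        groups[node.ref_id].append(node)
    return dict(groups)
```

A node whose definition could not be resolved would not join any such group, which is one way references can be missed without the file-level search.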
+
+## Examples of parsing Python projects
+
+### Flask
+We downloaded the [flask 3.1.0](https://github.com/pallets/flask/releases/tag/3.1.0) source code to `~/parsing/flask/`.
+The first step is to create a Python virtual environment and install flask's dependencies.
+Create a Python virtual environment and activate it:
+```bash
+cd ~/parsing/flask/
+python3 -m venv venv
+source venv/bin/activate
+```
+Next, we install the required dependencies listed in `pyproject.toml`.
+```bash
+pip install .
+```
+Further dependencies include development packages listed in `requirements/dev.txt`.
+These can also be installed using `pip`.
+```bash
+pip install -r requirements/dev.txt
+```
+Finally, we can run `CodeCompass_parser`.
+```bash
+CodeCompass_parser \
+  -n flask \
+  -i ~/parsing/flask/ \
+  -w ~/parsing/workdir/ \
+  -d "pgsql:host=localhost;port=5432;user=compass;password=pass;database=flask" \
+  -f \
+  --venvpath ~/parsing/flask/venv/ \
+  --label src=~/parsing/flask/
+```
+
+### CodeChecker
+We downloaded the [CodeChecker 6.24.4](https://github.com/Ericsson/codechecker/releases/tag/v6.24.4) source code to `~/parsing/codechecker`.
+CodeChecker automates creating a Python virtual environment and installing dependencies through the `venv` target of its Makefile:
+```bash
+cd ~/parsing/codechecker/
+make venv
+```
+Next, we can run `CodeCompass_parser`.
+```bash
+CodeCompass_parser \
+  -n codechecker \
+  -i ~/parsing/codechecker/ \
+  -w ~/parsing/workdir/ \
+  -d "pgsql:host=localhost;port=5432;user=compass;password=pass;database=codechecker" \
+  -f \
+  --venvpath ~/parsing/codechecker/venv/ \
+  --label src=~/parsing/codechecker/
+```
+
+## Troubleshooting
+A few errors can occur during the parsing process; these are highlighted in red.
+The stack trace is hidden by default, and can be shown using the `--stack-trace` flag.
+
+### Failed to use virtual environment
+This error can appear if one specifies the `--venvpath` option during parsing.
+The parser tried to use the specified virtual environment path, but it failed.
+
+#### Solution
+Double check that the Python virtual environment is set up correctly and its
+path is correct.
+If the error still persists, apply the `--stack-trace` parser option
+to view a more detailed stack trace of the error.
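Before re-running the parser, a quick sanity check of the directory passed to `--venvpath` may help. The sketch below is a heuristic and not part of CodeCompass: the helper name `looks_like_venv` is hypothetical, and it relies only on the standard `venv` layout (a `pyvenv.cfg` file plus an interpreter under `bin/` or `Scripts/`).

```python
import os


def looks_like_venv(path: str) -> bool:
    """Heuristic check: a venv has pyvenv.cfg and an interpreter inside it."""
    path = os.path.expanduser(path)
    has_cfg = os.path.isfile(os.path.join(path, "pyvenv.cfg"))
    candidates = [
        os.path.join(path, "bin", "python"),          # POSIX layout
        os.path.join(path, "Scripts", "python.exe"),  # Windows layout
    ]
    has_python = any(os.path.exists(p) for p in candidates)
    return has_cfg and has_python
```

For example, `looks_like_venv("~/parsing/flask/venv/")` returning `False` would suggest the path is wrong or the environment was never created.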
+
+### Missing module (file = path line = number)
+In this case, the parser tried to parse a given Python file, but it
+could not find a definition for a module.
+Commonly, the Python file imports another module and the parser cannot locate this module.
+If this happens, the Python file is marked *partial*, indicating that
+a module definition was not resolved in this file.
+The error message displays the module name, exact file path and line number
+so one can further troubleshoot this problem.
+
+#### Solution
+Ensure that the `--venvpath` option is correctly specified and all the required
+dependencies are installed in that Python virtual environment.
+If the imported module is part of the parsed project, use the `--syspath` option
+and specify the directory where the module is located.
+
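One way to narrow down a missing-module error is to ask the virtual environment's own interpreter whether it can locate the module. The sketch below uses the standard `importlib.util.find_spec`; the helper name `unresolved_modules` is hypothetical, and the check reflects only what that interpreter can import, not how the parser resolves modules internally.

```python
import importlib.util
from typing import Iterable, List


def unresolved_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that the current interpreter cannot locate."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised e.g. when the parent package of a dotted name is absent.
            spec = None
        if spec is None:
            missing.append(name)
    return missing
```

Run it with the interpreter inside the environment given to `--venvpath`, so the result reflects the packages installed there.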

plugins/python/CMakeLists.txt

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+add_subdirectory(model)
+add_subdirectory(parser)
+add_subdirectory(service)
+add_subdirectory(test)
+
+install_webplugin(webgui)
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+set(ODB_SOURCES
+  include/model/pyname.h
+)
+
+generate_odb_files("${ODB_SOURCES}")
+
+add_odb_library(pythonmodel ${ODB_CXX_SOURCES})
+target_link_libraries(pythonmodel model)
+
+install_sql()
Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+#ifndef CC_MODEL_PYNAME_H
+#define CC_MODEL_PYNAME_H
+
+#include <cstdint>
+#include <string>
+#include <odb/core.hxx>
+
+namespace cc
+{
+namespace model
+{
+
+enum PYNameID {
+  ID,
+  REF_ID,
+  PARENT,
+  PARENT_FUNCTION
+};
+
+#pragma db object
+struct PYName
+{
+  #pragma db id unique
+  std::uint64_t id = 0;
+
+  #pragma db index
+  std::uint64_t ref_id;
+
+  std::uint64_t parent;
+  std::uint64_t parent_function;
+
+  bool is_definition = false;
+  bool is_builtin = false;
+  bool is_import = false;
+  bool is_call = false;
+  std::string full_name;
+  std::string value;
+  std::string type;
+  std::string type_hint;
+
+  std::uint64_t line_start;
+  std::uint64_t line_end;
+  std::uint64_t column_start;
+  std::uint64_t column_end;
+
+  #pragma db index
+  std::uint64_t file_id;
+};
+
+}
+}
+
+#endif

plugins/python/parser/.gitignore

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+venv/
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+find_package(Python3 REQUIRED COMPONENTS Interpreter Development)
+find_package(Boost REQUIRED COMPONENTS python)
+
+include_directories(
+  include
+  ${PROJECT_SOURCE_DIR}/model/include
+  ${PROJECT_SOURCE_DIR}/util/include
+  ${PROJECT_SOURCE_DIR}/parser/include
+  ${PLUGIN_DIR}/model/include)
+
+include_directories(SYSTEM
+  ${Boost_INCLUDE_DIRS}
+  ${Python3_INCLUDE_DIRS})
+
+add_library(pythonparser SHARED
+  src/pythonparser.cpp)
+
+target_link_libraries(pythonparser
+  model
+  pythonmodel
+  ${Boost_LIBRARIES}
+  ${Python3_LIBRARIES})
+
+target_compile_options(pythonparser PUBLIC -Wno-unknown-pragmas)
+
+set(VENV_DIR "${PLUGIN_DIR}/parser/venv/")
+if(NOT EXISTS ${VENV_DIR})
+  message("Creating Python virtual environment: ${VENV_DIR}")
+  execute_process(
+    COMMAND python3 -m venv venv
+    WORKING_DIRECTORY ${PLUGIN_DIR}/parser/)
+endif()
+
+message("Installing Python dependencies...")
+execute_process(
+  COMMAND venv/bin/pip install -r requirements.txt
+  WORKING_DIRECTORY ${PLUGIN_DIR}/parser/)
+
+install(TARGETS pythonparser DESTINATION ${INSTALL_PARSER_DIR})
+install(
+  DIRECTORY pyparser/
+  DESTINATION ${INSTALL_PYTHON_DIR}/pyparser)
+install(
+  DIRECTORY venv/
+  DESTINATION ${INSTALL_PYTHON_DIR}/venv)
+
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+#ifndef CC_PARSER_PYTHONPARSER_H
+#define CC_PARSER_PYTHONPARSER_H
+
+#include <string>
+#include <vector>
+#include <unordered_map>
+#include <parser/abstractparser.h>
+#include <parser/parsercontext.h>
+#include <parser/sourcemanager.h>
+#include <util/parserutil.h>
+#include <boost/python.hpp>
+#include <model/pyname.h>
+#include <model/pyname-odb.hxx>
+namespace cc
+{
+namespace parser
+{
+
+namespace python = boost::python;
+
+typedef std::unordered_map<std::uint64_t, model::PYName> PYNameMap;
+
+class PythonParser : public AbstractParser
+{
+public:
+  PythonParser(ParserContext& ctx_);
+  virtual ~PythonParser();
+  virtual bool parse() override;
+private:
+  struct ParseResultStats {
+    std::uint32_t partial;
+    std::uint32_t full;
+  };
+
+  python::object m_py_module;
+  void processFile(const python::object& obj, PYNameMap& map, ParseResultStats& parse_result);
+  void parseProject(const std::string& root_path);
+};
+
+} // parser
+} // cc
+
+#endif // CC_PARSER_PYTHONPARSER_H
