Commit fe3824c

Python: Document API graphs
1 parent ad35c01 commit fe3824c

Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
.. _using-api-graphs-in-python:

Using API graphs in Python
==========================

API graphs are a uniform interface for referring to functions, classes, and methods defined in
external libraries.

About this article
------------------

This article describes how to use API graphs to reference classes and functions defined in library
code. This can be used to conveniently refer to external library functions when defining things like
remote flow sources.


Module imports
--------------

The most common entry point into the API graph will be the point where an external module or package is
imported. The API graph node corresponding to the ``re`` library, for instance, can be accessed
using the ``API::moduleImport`` method defined in the ``semmle.python.ApiGraphs`` module, as the
following snippet demonstrates.

.. code-block:: ql

   import python
   import semmle.python.ApiGraphs

   select API::moduleImport("re")

➤ `See this in the query console on LGTM.com <https://lgtm.com/query/1876172022264324639/>`__.

On its own, this only selects the API graph node corresponding to the ``re`` module. To find
where this module is referenced, we use the ``getAUse`` method. Thus, the following query selects
all references to the ``re`` module in the current database.

.. code-block:: ql

   import python
   import semmle.python.ApiGraphs

   select API::moduleImport("re").getAUse()

➤ `See this in the query console on LGTM.com <https://lgtm.com/query/8072356519514905526/>`__.


Note that the ``getAUse`` method accounts for local flow, so that ``my_re_compile`` in the following
snippet is correctly recognized as a reference to the ``re.compile`` function.

.. code-block:: python

   from re import compile as re_compile

   my_re_compile = re_compile

   r = my_re_compile(".*")

If only immediate uses are required, without taking local flow into account, then the method
``getAnImmediateUse`` may be used instead.

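
As a minimal sketch of the difference, the following query is the same as the earlier ``getAUse``
query with ``getAnImmediateUse`` substituted. Going by the description above, it should report only
the places where the ``re`` module is referenced directly, and not values that reach other
variables through local flow.

.. code-block:: ql

   import python
   import semmle.python.ApiGraphs

   // Immediate references to the re module only; local flow is not followed.
   select API::moduleImport("re").getAnImmediateUse()
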

Note that the given module name *must not* contain any dots. Thus, something like
``API::moduleImport("flask.views")`` will not do what you expect. Instead, this should be decomposed
into an access of the ``views`` member of the API graph node for ``flask``, as described in the next
section.

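
For illustration, the decomposition described above could be written as follows. This is a sketch
that combines ``API::moduleImport`` with the ``getMember`` method, and selects references to
``flask.views`` rather than passing the dotted name directly.

.. code-block:: ql

   import python
   import semmle.python.ApiGraphs

   // Start from the top-level flask package, then step to its views member.
   select API::moduleImport("flask").getMember("views").getAUse()
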

Accessing attributes
--------------------

Given a node in the API graph, we may access its attributes by using the ``getMember`` method.
Returning to the ``re.compile`` example above, we can find references to ``re.compile`` with the
following query:

.. code-block:: ql

   import python
   import semmle.python.ApiGraphs

   select API::moduleImport("re").getMember("compile").getAUse()

➤ `See this in the query console on LGTM.com <https://lgtm.com/query/7970570434725297676/>`__.

In addition to ``getMember``, the method ``getUnknownMember`` can be used to find references to API
components where the name is not known statically, and the ``getAMember`` method can be used to
access all members, both known and unknown.

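
As a rough sketch of ``getAMember`` in use, the following query selects references to any member
of the ``re`` module, whatever its name. Expect it to return many more results than a query for a
single named member.

.. code-block:: ql

   import python
   import semmle.python.ApiGraphs

   // References to every member of re, whether or not its name is known statically.
   select API::moduleImport("re").getAMember().getAUse()
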

Calls and class instantiations
------------------------------

To track instances of classes defined in external libraries, or the results of calling externally
defined functions, we may use the ``getReturn`` method. Thus, the following snippet finds all places
where the return value of ``re.compile`` is used:

.. code-block:: ql

   import python
   import semmle.python.ApiGraphs

   select API::moduleImport("re").getMember("compile").getReturn().getAUse()

➤ `See this in the query console on LGTM.com <https://lgtm.com/query/4346050399960356921/>`__.

Note that this includes all uses of the result of ``re.compile``, including those reachable via
local flow. To get just the *calls* to ``re.compile``, we can use ``getAnImmediateUse`` instead of
``getAUse``. As this is a common occurrence, the method ``getACall`` can be used instead of
``getReturn`` followed by ``getAnImmediateUse``.

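
A query using ``getACall`` might look like the following sketch, which is assembled from the
methods described above and selects only the call sites of ``re.compile``.

.. code-block:: ql

   import python
   import semmle.python.ApiGraphs

   // Shorthand for getReturn().getAnImmediateUse(): the calls themselves.
   select API::moduleImport("re").getMember("compile").getACall()
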

➤ `See this in the query console on LGTM.com <https://lgtm.com/query/8143347716552092926/>`__.

Note that the API graph does not distinguish between class instantiations and function calls. As far
as it's concerned, both are simply places where an API graph node is called.


Subclasses
----------

For many libraries, the main mode of usage is to extend one or more library classes. To track this
in the API graph, we can use the ``getASubclass`` method, which returns the API graph nodes
corresponding to the immediate subclasses of a node. To find *all* subclasses, use ``*`` or ``+`` to
apply the method repeatedly, as in ``getASubclass*()``.

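
As a small sketch of the distinction, the following query selects only the immediate subclasses of
``flask.views.View``. The choice of ``View`` here is purely illustrative; the comments note how
the closure operators change the result.

.. code-block:: ql

   import python
   import semmle.python.ApiGraphs

   // Immediate subclasses of flask.views.View only.
   // Use getASubclass+() to also include subclasses of subclasses,
   // or getASubclass*() to additionally include the View node itself.
   select API::moduleImport("flask").getMember("views").getMember("View").getASubclass()
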

Note that ``getASubclass`` does not account for any subclassing that takes place in library code
that has not been extracted. Thus, it may be necessary to account for this in the models you write.
For example, the ``flask.views.View`` class has a predefined subclass ``MethodView``, and so to find
all subclasses of ``View``, we must explicitly include the subclasses of ``MethodView`` as well.

.. code-block:: ql

   import python
   import semmle.python.ApiGraphs

   API::Node viewClass() {
     result =
       API::moduleImport("flask").getMember("views").getMember(["View", "MethodView"]).getASubclass*()
   }

   select viewClass()

➤ `See this in the query console on LGTM.com <https://lgtm.com/query/288293322319747121/>`__.

Note the use of the set literal ``["View", "MethodView"]`` to match both classes simultaneously.


Built-in functions and classes
------------------------------

Built-in functions and classes can be accessed using the ``API::builtin`` method, giving the name of
the built-in as an argument.

For instance, the following snippet finds all calls to the built-in ``open`` function:

.. code-block:: ql

   import python
   import semmle.python.ApiGraphs

   select API::builtin("open").getACall()

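
Built-in nodes can be combined with the other methods described in this article. As a sketch, the
following query uses ``getReturn`` and ``getAUse`` to select the places where the value returned by
``open`` is used, rather than the calls themselves.

.. code-block:: ql

   import python
   import semmle.python.ApiGraphs

   // Uses of the value returned by open(), including those reached via local flow.
   select API::builtin("open").getReturn().getAUse()
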

Further reading
---------------

.. include:: ../reusables/python-further-reading.rst
.. include:: ../reusables/codeql-ref-tools-further-reading.rst
