}
```

The request `type` describes the crawling activity being requested. For example, "do `package` crawling" (see [More on type](#more-on-type) for a description of valid type values). It is typically the same as the `type` in the URL (see the segment descriptions below). There are some more advanced scenarios where the two values differ, but for starters, treat them as the same. The general form of a request URL is as follows (note: it is a URL because of the underlying crawling infrastructure; the `cd` scheme itself is not particularly relevant):
```
cd:/type/provider/namespace/name/revision
```

The crawler's output is stored for use by the rest of the ClearlyDefined infrastructure -- it is not intended to be used directly by humans. Note that each tool's output is stored separately and the results of processing the component and the component source are also separated.
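To make the shape of the request URL above concrete, here is a minimal sketch (not code from the crawler itself; the function name is made up) that splits a `cd:` request URL into its five segments:

```javascript
// Hypothetical helper (not part of the crawler codebase): split a
// request URL of the form cd:/type/provider/namespace/name/revision
// into its named segments.
function parseRequestUrl(url) {
  const [type, provider, namespace, name, revision] = url
    .replace(/^cd:\//, '')
    .split('/');
  return { type, provider, namespace, name, revision };
}

console.log(parseRequestUrl('cd:/npm/npmjs/-/redie/0.3.0'));
// → { type: 'npm', provider: 'npmjs', namespace: '-', name: 'redie', revision: '0.3.0' }
```

Note that in coordinates like the npm example here, `-` stands in for an empty namespace (e.g., an unscoped npm package).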

### <a id="more-on-type"></a>More on `type`

The `type` in the request object typically corresponds to an internal processor in CD.

1. `component` is the most generic type. Internally, it is converted to a `package` or `source` request by the component processor.
2. A `package` request is processed by the package processor and is further converted to a request with a specific type (`crate`, `deb`, `gem`, `go`, `maven`, `npm`, `nuget`, `composer`, `pod`, `pypi`). If the specific binary package type is already known, that type (e.g., `npm`) can be used instead of `package` in the harvest request, skipping the conversion step. For example,

   ```json
   {
     "type": "npm",
     "url": "cd:/npm/npmjs/-/redie/0.3.0"
   }
   ```

3. `source` requests are processed by the source processor, which subsequently dispatches a `clearlydefined` typed request for the supported source types, plus other requests (one for each scanning tool). These are the more advanced scenarios where the request type and the coordinate type differ.
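   For instance (a hedged sketch; the coordinates below are invented, and the actual source type and revision would come from the component being harvested), a `source` request might look like:

   ```json
   {
     "type": "source",
     "url": "cd:/git/github/someorg/somerepo/0e5e8a77f6b4e7e904da077558e0f9c9fca45ba3"
   }
   ```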
# Configuration
The crawler is quite configurable. Out of the box it is set up for demo-level use directly on your computer. In its full glory it can run with arbitrarily many distributed clients using an array of different queuing, caching, and storage technologies.
## Run Docker image from Docker Hub
You can run the image as-is from Docker Hub (this is without any port forwarding, which means the only way you can interact with the crawler locally is through the queue; see below for examples of how to run with ports exposed for curl-based testing):

`docker run --platform linux/amd64 --env-file ../<env_name>.env.list clearlydefined/crawler`
See the `local.env.list`, `dev.env.list`, and `prod.env.list` template files.
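As a hedged sketch of the format these files use (one `KEY=value` pair per line, as `docker run --env-file` expects; the variable name below is illustrative only, so consult the templates for the settings the crawler actually reads):

```
# illustrative only; real variable names are in the template files
CRAWLER_ID=my-crawler-1
```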