Deployed to dev and ran a test with
Great catch, and thanks for your contribution! This will definitely improve Composer package harvesting.
Just a note: ClearlyDefined also tries to harvest source (e.g., GitHub) components during source discovery, using registry metadata. Composer v2 metadata (p2 API) is compressed—only the latest version has full metadata, while older versions are diffs and may miss key fields like homepage, dist, and source. These fields are used during source discovery in composerExtract.js.
To restore full metadata for any version, you can use logic similar to Composer\MetadataMinifier\MetadataMinifier::expand() (see "Getting the Package Data, Using the Composer v2 metadata" at https://packagist.org/apidoc). Alternatively, the package API provides complete metadata for all versions (see "Getting the Package Data, Using the API" at https://packagist.org/apidoc).
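For reference, the expansion logic mentioned above can be sketched in JavaScript roughly as follows. This is a minimal sketch modeled on how `MetadataMinifier::expand()` behaves, not code from this repository: the function name `expandVersions` and the sample data are hypothetical, and it assumes the p2 version list is ordered with the fully populated entry first and that the string `'__unset'` marks a removed field.

```javascript
// Hypothetical helper: rebuild full metadata for every version from the
// minified p2 list. The first entry is complete; each later entry carries
// only the keys that differ from the entry before it.
function expandVersions(versions) {
  const expanded = []
  let current = null
  for (const diff of versions) {
    if (current === null) {
      // First entry is already complete.
      current = { ...diff }
    } else {
      // Copy the previous full entry so earlier results are not mutated.
      current = { ...current }
      for (const [key, value] of Object.entries(diff)) {
        if (value === '__unset') delete current[key] // field removed in this version
        else current[key] = value // field added or changed in this version
      }
    }
    expanded.push(current)
  }
  return expanded
}

// Illustrative data only, not real registry output.
const sample = [
  { version: 'v2.0.0', homepage: 'https://example.test', license: ['MIT'] },
  { version: 'v1.0.0', license: '__unset' }
]
const full = expandVersions(sample)
console.log(full[1].homepage)
```

With something like this in place, older versions would carry inherited fields such as homepage, dist, and source into source discovery.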
This pull request addresses most of the cases. Some cases that rely on the full registry data can be addressed in a separate pull request.
A previous issue in the code: providerMap.packagist contains a trailing /, so the template string produced `https://repo.packagist.org//p2/symfony/polyfill-mbstring.json`. This pull request removes the extra slash.
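One way to guard against the doubled slash in general is to normalize both sides before joining. The sketch below is illustrative only and is not necessarily how this pull request fixes it: `joinUrl` is a hypothetical helper, and the base value is assumed from the URL quoted above.

```javascript
// Hypothetical helper: join a base URL and a path with exactly one slash,
// regardless of whether the base ends with "/" or the path starts with "/".
function joinUrl(base, path) {
  const trimmedBase = base.replace(/\/+$/, '') // drop trailing slashes
  const trimmedPath = path.replace(/^\/+/, '') // drop leading slashes
  return `${trimmedBase}/${trimmedPath}`
}

// Assumed value of providerMap.packagist, including its trailing slash.
const packagistBase = 'https://repo.packagist.org/'
const registryUrl = joinUrl(packagistBase, '/p2/symfony/polyfill-mbstring.json')
console.log(registryUrl)
```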
Adding the expand process is more involved, so it is better deferred to a separate PR.
@elrayle In the integration test run, Failure Case 1 (expected to return the same definition as production for composer/packagist/symfony/polyfill-mbstring/v1.28.0) and Failure Case 3 (expected to return the same notice as production for the same package and version) are likely caused by changes in the registry data. In both cases, the discrepancies relate to the project website information retrieved from the registry. Expanding the registry data for older versions will likely resolve these issues.
The composer API in use for importing from the package manager is deprecated as of Sept 1. It now returns a 403 for all requests.
This PR updates packagist calls to the `p2` endpoint. The format of that endpoint has changed. The mock data is the result of the actual call to that endpoint.
Added a test to directly test that `_getRegistryData` processes data in the `p2` format.