diff --git a/.github/dictionary.txt b/.github/dictionary.txt
index 6f7d8789ca..aab2e8f804 100644
--- a/.github/dictionary.txt
+++ b/.github/dictionary.txt
@@ -1,28 +1,31 @@
-setbit
+additionals
+backpressure
+blpop
+buildx
+Certicom
+chacha
+connmanager
+dialback
+dout
 getbit
-stopstr
-rlflx
-incrby
 hopr
-dout
-supercop
+incrby
+microtask
 nothrow
-buildx
-blpop
+peerStore
+reprovide
+reprovided
+reprovider
+reprovides
+reproviding
+rlflx
 rpush
-additionals
-SECG
-Certicom
 RSAES
+SECG
+setbit
+stopstr
+supercop
 unuse
-dialback
-chacha
-peerStore
+userland
 xxhandshake
 zerolen
-connmanager
-reprovide
-reprovider
-reproviding
-reprovides
-reprovided
diff --git a/doc/migrations/v2.0.0-v3.0.0.md b/doc/migrations/v2.0.0-v3.0.0.md
new file mode 100644
index 0000000000..cc9a1baaa2
--- /dev/null
+++ b/doc/migrations/v2.0.0-v3.0.0.md
@@ -0,0 +1,451 @@
+
+# Migrating to libp2p@3.0
+
+A migration guide for refactoring your application code from libp2p `v2.0.0` to `v3.0.0`.
+
+## Table of Contents
+
+- [Streams are now EventTargets](#streams-are-now-eventtargets)
+  - [Stream closing](#stream-closing)
+  - [Write backpressure](#write-backpressure)
+  - [Imperative streams](#imperative-streams)
+    - [byteStream](#bytestream)
+    - [lengthPrefixedStream](#lengthprefixedstream)
+    - [protobufStream](#protobufstream)
+- [Protocol handlers now accept a stream and a connection](#protocol-handlers-now-accept-a-stream-and-a-connection)
+- [Protocol handlers can now be async](#protocol-handlers-can-now-be-async)
+- [Stream middleware](#stream-middleware)
+- [Lightweight Multiaddrs](#lightweight-multiaddrs)
+- [Deprecated code has been removed](#deprecated-code-has-been-removed)
+
+## Streams are now EventTargets
+
+When the JavaScript implementation of libp2p first came into being it used
+[Node.js streams](https://nodejs.org/api/stream.html). At the time these were
+high performance but non-standard and could be hard to work with, particularly
+around error handling and propagation.
+
+Next came [pull-streams](https://github.com/pull-stream/pull-stream). These were
+a userland attempt to simplify the streaming model and invert control so stream
+sinks pull data from stream sources when they are ready, giving you internal
+stream [backpressure](https://medium.com/@jayphelps/backpressure-explained-the-flow-of-data-through-software-2350b3e77ce7)
+almost for free. Unfortunately they were also quite hard to work with and the
+community never really coalesced around them as a standard, so development has
+slowed to a practical standstill.
+
+Meanwhile the JavaScript language kept evolving and introduced [AsyncIterators](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/AsyncIterator) and the [for await...of](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/for-await...of)
+loop, a way of pulling data from a source in an async manner, with implicit
+backpressure and ways to close the stream early (i.e. you `break` from the
+loop). [Streaming Iterables](https://gist.github.com/alanshaw/591dc7dd54e4f99338a347ef568d6ee9)
+were born; they follow the design ideas from `pull-streams` but are implemented
+using standard language constructs.
+
+In the intervening years, Streaming Iterables have not really been adopted
+outside of libp2p, and that unfamiliarity raises the barrier to entry for new
+developers.
+
+Streaming Iterables lean heavily on promises to function, and encourage a
+programming model in which async transforms cause downstream transforms to
+also become async, which can add a [surprising amount of latency](https://github.com/ChainSafe/js-libp2p-gossipsub/pull/361)
+even if the transform is trivial. This makes it hard to write high-performance
+code, since the user must consider the price of async at all times.
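The pull model described above can be sketched with nothing but standard language constructs: an async generator only produces a value when the consumer asks for one, and breaking out of the `for await...of` loop closes the source (its `finally` block runs). A minimal, library-free sketch for illustration only (the names here are illustrative, not a libp2p API):

```typescript
// a source that only produces a value when the consumer asks for one
async function* source (): AsyncGenerator<number> {
  try {
    for (let i = 0; i < 100; i++) {
      yield i
    }
  } finally {
    // runs when the consumer breaks out of the loop early
    console.info('source closed')
  }
}

// pull values until we have three, then close the stream by breaking
async function consumeFirstThree (): Promise<number[]> {
  const received: number[] = []

  for await (const n of source()) {
    received.push(n)

    if (received.length === 3) {
      break
    }
  }

  return received
}

console.info(await consumeFirstThree()) // [ 0, 1, 2 ]
```

Both pull-streams and Streaming Iterables build on this pull-based flow, which is why a producer only learns about a slow consumer implicitly.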
+
+Both pull-streams and Streaming Iterables use implicit backpressure, which means
+the producer that is being throttled is not notified, so it can end up
+overloading a system if it continues producing data that simply piles up in an
+internal buffer.
+
+If you have profiled a high-traffic libp2p application you may have seen many
+Promises resident in memory; this is a result of implicit backpressure and
+can cause increased GC churn when data starts to flow again.
+
+There are also network-level backpressure capabilities that could be used if the
+producers were notified. For example, the Yamux stream muxer can prevent a
+remote peer from sending more data by not increasing the send window while the
+local system is dealing with the current slice of data.
+
+Fast forward to the present day. The [EventTarget](https://developer.mozilla.org/en-US/docs/Web/API/EventTarget)
+API now exists as part of the platform and standardizes the pattern used
+by the Node.js [EventEmitter](https://nodejs.org/docs/latest/api/events.html),
+which is heavily utilized by its streaming primitives.
+
+Events dispatched by EventTargets are synchronous, which makes them very high
+performance since continuations are not performed in the [microtask queue](https://javascript.info/event-loop).
+
+This means that more work can be performed in the current context without using
+`await`, which introduces a tiny amount of latency every time it is used.
+
+As a consequence of this we have seen a [modest increase in throughput](https://observablehq.com/@libp2p-workspace/performance-dashboard?branch=2db84fb7a5402e1cd7f09616f98efbed7609ba5f) with `EventTarget` streams but
+the real win is the decrease in min/max values and the tightening of the
+result set - this is largely down to reductions in latency achieved by
+removing unnecessary async behavior.
+
+This does mean that the streaming API has changed. Where before we had `source`
+and `sink` properties, we now listen for `message` events and call `.send` with
+`Uint8Array`/`Uint8ArrayList` values.
+
+We can detect when write backpressure needs to be applied by `.send` returning
+`false` (we should await a `drain` event when this happens), and read
+backpressure can be explicitly applied by calling the new `.pause` and `.resume`
+methods.
+
+> [!CAUTION]
+> If no `message` event handler is added, streams will buffer incoming data
+> until a pre-configured limit is reached, after which the stream will be
+> reset (see the `maxReadBufferLength` init param exposed by multiplexers and
+> transports that supply their own multiplexer).
+>
+> If you need to perform a long-running async task in your stream handler before
+> consuming stream data you should `.pause` the stream first and `.resume` it
+> when you are ready.
+>
+> Where possible this will signal to the data source that it should not send any
+> more data.
+
+**Before**
+
+```ts
+import { createLibp2p } from 'libp2p'
+import { peerIdFromString } from '@libp2p/peer-id'
+
+const node = await createLibp2p({
+  // libp2p config here
+})
+
+const remotePeer = peerIdFromString('123Foo...')
+const stream = await node.dialProtocol(remotePeer, '/echo/1.0.0', {
+  signal: AbortSignal.timeout(5_000)
+})
+
+stream.sink([
+  new TextEncoder().encode('hello world')
+])
+  .catch(err => {
+    stream.abort(err)
+  })
+
+for await (const buf of stream.source) {
+  console.info(new TextDecoder().decode(buf)) // prints 'hello world'
+}
+```
+
+**After**
+
+```ts
+import { createLibp2p } from 'libp2p'
+import { peerIdFromString } from '@libp2p/peer-id'
+
+const node = await createLibp2p({
+  // libp2p config here
+})
+
+const remotePeer = peerIdFromString('123Foo...')
+const stream = await node.dialProtocol(remotePeer, '/echo/1.0.0', {
+  signal: AbortSignal.timeout(5_000)
+})
+
+stream.addEventListener('message', event => {
+  console.info(new TextDecoder().decode(event.data)) // prints 'hello world'
+})
+
+stream.send(new TextEncoder().encode('hello world'))
+```
+
+> [!TIP]
+> Streams are still `AsyncIterable`, so you can use `for await...of` to iterate over their contents:
+
+```ts
+import { createLibp2p } from 'libp2p'
+import { peerIdFromString } from '@libp2p/peer-id'
+
+const node = await createLibp2p({
+  // libp2p config here
+})
+
+const remotePeer = peerIdFromString('123Foo...')
+const stream = await node.dialProtocol(remotePeer, '/echo/1.0.0', {
+  signal: AbortSignal.timeout(5_000)
+})
+
+stream.send(new TextEncoder().encode('hello world'))
+
+for await (const buf of stream) {
+  console.info(new TextDecoder().decode(buf)) // prints 'hello world'
+}
+```
+
+### Stream closing
+
+When streams close they emit a `close` event. This event has an `error: Error` property that can be used to detect an unclean exit, and a `local: boolean` property that tells the user whether it was a local abort or a remote reset.
+
+### Write backpressure
+
+You can use [p-event](https://www.npmjs.com/package/p-event) or [race-event](https://www.npmjs.com/package/race-event) to pause writing due to backpressure:
+
+```ts
+import { createLibp2p } from 'libp2p'
+import { peerIdFromString } from '@libp2p/peer-id'
+import { pEvent } from 'p-event'
+
+const node = await createLibp2p({
+  // libp2p config here
+})
+
+const remotePeer = peerIdFromString('123Foo...')
+const stream = await node.dialProtocol(remotePeer, '/echo/1.0.0', {
+  signal: AbortSignal.timeout(5_000)
+})
+
+const bufs = [
+  // a lot of data
+]
+
+for (const buf of bufs) {
+  if (!stream.send(buf)) {
+    await pEvent(stream, 'drain', {
+      rejectionEvents: [
+        'close'
+      ]
+    })
+  }
+}
+```
+
+### Imperative streams
+
+The `@libp2p/utils` module now exports some functions to make imperative stream programming simpler. These are largely ported from the [it-protobuf-stream](https://www.npmjs.com/package/it-protobuf-stream), [it-length-prefixed-stream](https://www.npmjs.com/package/it-length-prefixed-stream) and [it-byte-stream](https://www.npmjs.com/package/it-byte-stream) modules.
+
+#### byteStream
+
+The `byteStream` function lets you read/write arbitrary amounts of bytes
+to/from the stream in an imperative style. The `read` method accepts a `bytes`
+option which will resolve the returned promise once that number of bytes have
+been received; otherwise it will just return whatever bytes were read in the
+last chunk of data received from the underlying stream.
+
+```ts
+import { createLibp2p } from 'libp2p'
+import { peerIdFromString } from '@libp2p/peer-id'
+import { byteStream } from '@libp2p/utils'
+
+const node = await createLibp2p({
+  // libp2p config here
+})
+
+const remotePeer = peerIdFromString('123Foo...')
+const stream = await node.dialProtocol(remotePeer, '/echo/1.0.0', {
+  signal: AbortSignal.timeout(5_000)
+})
+
+const bytes = byteStream(stream)
+
+await bytes.write(Uint8Array.from([0, 1, 2, 3]), {
+  signal: AbortSignal.timeout(5_000)
+})
+
+const output = await bytes.read({
+  signal: AbortSignal.timeout(5_000)
+})
+
+console.info(output) // Uint8Array([0, 1, 2, 3])
+```
+
+#### lengthPrefixedStream
+
+The `lengthPrefixedStream` function lets you read/write discrete messages
+to/from the stream in an imperative style.
+
+All data written to the stream is prefixed with a [varint](https://protobuf.dev/programming-guides/encoding/#varints)
+that contains the number of bytes in the following message.
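On the wire this framing is straightforward: the length is written as an unsigned varint (7 bits per byte, with the high bit set on every byte except the last), followed by the message bytes. A minimal sketch of that encoding for illustration only (`encodeVarint` and `frame` are hypothetical helper names, not part of `@libp2p/utils`):

```typescript
// encode an unsigned integer as a protobuf-style varint,
// 7 bits per byte, high bit used as a continuation flag
function encodeVarint (value: number): Uint8Array {
  const bytes: number[] = []

  do {
    let b = value & 0x7f
    value >>>= 7

    if (value > 0) {
      b |= 0x80
    }

    bytes.push(b)
  } while (value > 0)

  return Uint8Array.from(bytes)
}

// frame a message: varint length prefix followed by the payload
function frame (payload: Uint8Array): Uint8Array {
  const prefix = encodeVarint(payload.byteLength)
  const out = new Uint8Array(prefix.byteLength + payload.byteLength)
  out.set(prefix, 0)
  out.set(payload, prefix.byteLength)
  return out
}

// a 4-byte payload is framed as [4, 0, 1, 2, 3]
console.info(frame(Uint8Array.from([0, 1, 2, 3])))
```

The examples that follow use the library helper, which performs this framing (and the corresponding decoding) for you.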
+
+```ts
+import { createLibp2p } from 'libp2p'
+import { peerIdFromString } from '@libp2p/peer-id'
+import { lengthPrefixedStream } from '@libp2p/utils'
+
+const node = await createLibp2p({
+  // libp2p config here
+})
+
+const remotePeer = peerIdFromString('123Foo...')
+const stream = await node.dialProtocol(remotePeer, '/echo/1.0.0', {
+  signal: AbortSignal.timeout(5_000)
+})
+
+const lp = lengthPrefixedStream(stream)
+
+await lp.write(Uint8Array.from([0, 1, 2, 3]), {
+  signal: AbortSignal.timeout(5_000)
+})
+
+const output = await lp.read({
+  signal: AbortSignal.timeout(5_000)
+})
+
+console.info(output) // Uint8Array([0, 1, 2, 3])
+```
+
+#### protobufStream
+
+The `protobufStream` function lets you read/write [protobuf](https://protobuf.dev)
+messages to/from the stream in an imperative style.
+
+In the example below, the `Message` class is generated from a `.proto`
+file using [protons](http://npmjs.com/package/protons). Other protobuf
+encoders/decoders are available.
+
+```ts
+import { createLibp2p } from 'libp2p'
+import { peerIdFromString } from '@libp2p/peer-id'
+import { protobufStream } from '@libp2p/utils'
+import { Message } from './hello-world.js'
+
+const node = await createLibp2p({
+  // libp2p config here
+})
+
+const remotePeer = peerIdFromString('123Foo...')
+const stream = await node.dialProtocol(remotePeer, '/echo/1.0.0', {
+  signal: AbortSignal.timeout(5_000)
+})
+
+const pb = protobufStream(stream)
+
+await pb.write({
+  hello: 'world'
+}, Message, {
+  signal: AbortSignal.timeout(5_000)
+})
+
+const output = await pb.read(Message, {
+  signal: AbortSignal.timeout(5_000)
+})
+
+console.info(output) // { hello: 'world' }
+```
+
+## Protocol handlers now accept a stream and a connection
+
+Prior to v3, protocol handlers accepted an object with `stream` and
+`connection` properties. These have been split into two arguments.
+
+**Before**
+
+```ts
+import { createLibp2p } from 'libp2p'
+
+const node = await createLibp2p({
+  // libp2p config here
+})
+
+node.handle('/my/protocol', ({ stream, connection }) => {
+  // read/write stream data here
+})
+```
+
+**After**
+
+```ts
+import { createLibp2p } from 'libp2p'
+
+const node = await createLibp2p({
+  // libp2p config here
+})
+
+node.handle('/my/protocol', async (stream, connection) => {
+  // read/write stream data here
+})
+```
+
+## Protocol handlers can now be async
+
+Prior to `libp2p@3.x.x` protocol handlers had to be synchronous functions.
+
+It was very common to need to perform some async work in a protocol handler, so
+a common pattern was to use `Promise.resolve().then(...)` and perform the
+continuation in the `then` callback.
+
+From v3 they can return promises, which improves the developer experience.
+
+If the returned promise rejects, the stream will be aborted using the rejection
+reason.
+
+**Before**
+
+```ts
+import { createLibp2p } from 'libp2p'
+
+const node = await createLibp2p({
+  // libp2p config here
+})
+
+node.handle('/my/protocol', ({ stream, connection }) => {
+  Promise.resolve().then(async () => {
+    for await (const buf of stream) {
+      //... process stream data
+    }
+  })
+    .catch(err => {
+      stream.abort(err)
+    })
+})
+```
+
+**After**
+
+```ts
+import { createLibp2p } from 'libp2p'
+
+const node = await createLibp2p({
+  // libp2p config here
+})
+
+node.handle('/my/protocol', async (stream, connection) => {
+  for await (const buf of stream) {
+    //... process stream data
+  }
+})
+```
+
+## Stream middleware
+
+It's possible to intercept incoming streams outside of protocol handlers to change their contents or even deny access.
+
+The implementation takes inspiration from the venerable [express](https://expressjs.com/) framework, so it should be familiar to many developers.
+
+Middleware functions can be sync or async; if they throw/reject, the stream
+will be aborted using the thrown error/rejection reason.
+
+```ts
+import { createLibp2p } from 'libp2p'
+
+const node = await createLibp2p({
+  // libp2p config here
+})
+
+node.use('/my/protocol', async (stream, connection, next) => {
+  // perform middleware actions here
+
+  next(stream, connection)
+})
+```
+
+## Lightweight Multiaddrs
+
+`@multiformats/multiaddr@13.x.x` has had a large amount of code/functionality
+removed, most notably around resolution of DNS tuples to IP addresses and other
+functionality that would normally only happen as part of a libp2p dial.
+
+It has also had some baked-in assumptions about the structure of a multiaddr
+removed, and it's now possible to access multiaddr tuples with string and
+numeric components without having to wade through arrays nested inside other
+arrays.
+
+If you have a dependency on `@multiformats/multiaddr`, please upgrade it to
+`13.x.x` for use with `libp2p@3.x.x`.
+
+## Deprecated code has been removed
+
+All fields/methods/classes/etc. marked as `@deprecated` in `libp2p@2.x.x` have
+been removed.
+
+All removed fields had JSDoc instructions explaining what to use instead, or
+noting that they were redundant and their use should simply be removed. Please
+refer to these notes to upgrade your app.