Commit fa38a0a

aschmahmann and lidel authored
feat: enable using an HTTP block provider as a routing backend (#110)
* feat: enable using an HTTP block provider as a routing backend

* clarify http paths are not supported

Co-authored-by: Marcin Rataj <[email protected]>

* clarify only https and not http is allowed

Co-authored-by: Marcin Rataj <[email protected]>

* add changelog for http block providers flags

* feat: switch accelerated-dht bool option to dht string option to allow disabling the dht entirely

* fix: set Accept on requests

* fix: 5s timeout for HEAD checks

- finish sooner rather than waiting 30s (DefaultRoutingTimeout)
- don't return unactionable errors to end user, everything other than HTTP 200 should produce empty /routing/v1/providers result set

* test: TestHTTPBlockRouter

* docs: environment-variables.md

---------

Co-authored-by: Marcin Rataj <[email protected]>
1 parent 8269cc6 commit fa38a0a

7 files changed: +399 -32 lines changed

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -15,8 +15,13 @@ The following emojis are used to highlight certain changes:

 ### Added

+- Added `http-block-provider-endpoints` and `http-block-provider-peerids` options to enable using a [trustless HTTP gateway](https://specs.ipfs.tech/http-gateways/trustless-gateway/) as a source for synthetic content routing records.
+  - When the configured gateway responds with HTTP 200 to an HTTP HEAD request for a block (`HEAD /ipfs/{cid}?format=raw`), `FindProviders` returns a provider record containing a predefined PeerID and the HTTP gateway as a multiaddr with `/tls/http` suffix.
+
 ### Changed

+- `accelerated-dht` option was removed and replaced with a `dht` option which enables toggling between the standard client, accelerated client and being disabled
+
 ### Removed

 ### Fixed
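
The `dht` option in the changelog entry above replaces a boolean with a three-way mode. A minimal sketch of how such a value could be validated follows. The `dhtMode` type and `parseDHTMode` helper are hypothetical names that do not come from this commit, and falling back to `accelerated` for an empty value is an assumption based on the default documented in `docs/environment-variables.md`.

```go
package main

import (
	"fmt"
	"strings"
)

// dhtMode is a hypothetical type used for illustration only; how the commit
// represents the `dht` option internally is not shown in this diff.
type dhtMode string

const (
	dhtStandard    dhtMode = "standard"
	dhtAccelerated dhtMode = "accelerated"
	dhtDisabled    dhtMode = "disabled"
)

// parseDHTMode validates a user-supplied mode string. An empty value falls
// back to "accelerated" (assumption: mirrors the documented default).
func parseDHTMode(s string) (dhtMode, error) {
	switch m := dhtMode(strings.TrimSpace(s)); m {
	case "":
		return dhtAccelerated, nil
	case dhtStandard, dhtAccelerated, dhtDisabled:
		return m, nil
	default:
		return "", fmt.Errorf("unknown dht mode %q: expected standard, accelerated or disabled", s)
	}
}

func main() {
	// "true" was valid for the old boolean flag but is rejected here.
	for _, in := range []string{"", "standard", "disabled", "true"} {
		m, err := parseDHTMode(in)
		fmt.Printf("%q -> %q, err=%v\n", in, m, err)
	}
}
```

A string-valued option like this leaves room for further modes later without another breaking rename.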

docs/environment-variables.md

Lines changed: 24 additions & 5 deletions
@@ -4,10 +4,15 @@

 - [Configuration](#configuration)
   - [`SOMEGUY_LISTEN_ADDRESS`](#someguy_listen_address)
-  - [`SOMEGUY_ACCELERATED_DHT`](#someguy_accelerated_dht)
+  - [`SOMEGUY_DHT`](#someguy_dht)
+  - [`SOMEGUY_CACHED_ADDR_BOOK`](#someguy_cached_addr_book)
+  - [`SOMEGUY_CACHED_ADDR_BOOK_RECENT_TTL`](#someguy_cached_addr_book_recent_ttl)
+  - [`SOMEGUY_CACHED_ADDR_BOOK_ACTIVE_PROBING`](#someguy_cached_addr_book_active_probing)
   - [`SOMEGUY_PROVIDER_ENDPOINTS`](#someguy_provider_endpoints)
   - [`SOMEGUY_PEER_ENDPOINTS`](#someguy_peer_endpoints)
   - [`SOMEGUY_IPNS_ENDPOINTS`](#someguy_ipns_endpoints)
+  - [`SOMEGUY_HTTP_BLOCK_PROVIDER_ENDPOINTS`](#someguy_http_block_provider_endpoints)
+  - [`SOMEGUY_HTTP_BLOCK_PROVIDER_PEERIDS`](#someguy_http_block_provider_peerids)
   - [`SOMEGUY_LIBP2P_LISTEN_ADDRS`](#someguy_libp2p_listen_addrs)
   - [`SOMEGUY_LIBP2P_CONNMGR_LOW`](#someguy_libp2p_connmgr_low)
   - [`SOMEGUY_LIBP2P_CONNMGR_HIGH`](#someguy_libp2p_connmgr_high)
@@ -20,8 +25,8 @@
   - [`GOLOG_FILE`](#golog_file)
   - [`GOLOG_TRACING_FILE`](#golog_tracing_file)
 - [Tracing](#tracing)
-  - [`SOMEGUY_SAMPLING_FRACTION`](#someguy_sampling_fraction)
   - [`SOMEGUY_TRACING_AUTH`](#someguy_tracing_auth)
+  - [`SOMEGUY_SAMPLING_FRACTION`](#someguy_sampling_fraction)

 ## Configuration

@@ -31,11 +36,11 @@ The address to listen on.

 Default: `127.0.0.1:8190`

-### `SOMEGUY_ACCELERATED_DHT`
+### `SOMEGUY_DHT`

-Whether or not the Accelerated DHT is enabled or not.
+Controls DHT client mode: `standard`, `accelerated`, `disabled`

-Default: `true`
+Default: `accelerated`

 ### `SOMEGUY_CACHED_ADDR_BOOK`

@@ -73,6 +78,20 @@ Comma-separated list of other Delegated Routing V1 endpoints to proxy IPNS reque

 Default: none

+### `SOMEGUY_HTTP_BLOCK_PROVIDER_ENDPOINTS`
+
+Comma-separated list of [HTTP trustless gateways](https://specs.ipfs.tech/http-gateways/trustless-gateway/) for probing and generating synthetic provider records.
+
+When the configured gateway responds with HTTP 200 to an HTTP HEAD request for a block (`HEAD /ipfs/{cid}?format=raw`), `FindProviders` returns a provider record containing a PeerID from `SOMEGUY_HTTP_BLOCK_PROVIDER_PEERIDS` and the HTTP gateway endpoint as a multiaddr with `/tls/http` suffix.
+
+Default: none
+
+### `SOMEGUY_HTTP_BLOCK_PROVIDER_PEERIDS`
+
+Comma-separated list of [multibase-encoded peerIDs](https://github.com/libp2p/specs/blob/master/peer-ids/peer-ids.md#string-representation) to use in synthetic provider records returned for HTTP providers in `SOMEGUY_HTTP_BLOCK_PROVIDER_ENDPOINTS`.
+
+Default: none
+
 ### `SOMEGUY_LIBP2P_LISTEN_ADDRS`

 Multiaddresses for libp2p host to listen on (comma-separated).
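
`SOMEGUY_HTTP_BLOCK_PROVIDER_ENDPOINTS` and `SOMEGUY_HTTP_BLOCK_PROVIDER_PEERIDS` are both comma-separated lists. The sketch below shows one way to split them and pair each endpoint with a peer ID. It assumes entries are matched by position and that both lists must have the same length, which this diff does not confirm. `providerConfig`, `splitList` and `pairProviders` are hypothetical names, the example hostname is a placeholder, and peer IDs are kept as plain strings to stay within the standard library.

```go
package main

import (
	"fmt"
	"strings"
)

// providerConfig is a hypothetical pairing of one gateway endpoint with the
// peer ID to advertise for it.
type providerConfig struct {
	Endpoint string
	PeerID   string
}

// splitList splits a comma-separated value, trimming whitespace and dropping
// empty entries.
func splitList(s string) []string {
	var out []string
	for _, part := range strings.Split(s, ",") {
		if p := strings.TrimSpace(part); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// pairProviders matches endpoints and peer IDs by position.
// Assumption: both lists must have the same number of entries.
func pairProviders(endpoints, peerIDs string) ([]providerConfig, error) {
	eps, ids := splitList(endpoints), splitList(peerIDs)
	if len(eps) != len(ids) {
		return nil, fmt.Errorf("got %d endpoints but %d peer IDs", len(eps), len(ids))
	}
	pairs := make([]providerConfig, 0, len(eps))
	for i := range eps {
		pairs = append(pairs, providerConfig{Endpoint: eps[i], PeerID: ids[i]})
	}
	return pairs, nil
}

func main() {
	pairs, err := pairProviders(
		"https://gateway.example.com", // placeholder endpoint
		"12D3KooWCjfPiojcCUmv78Wd1NJzi4Mraj1moxigp7AfQVQvGLwH",
	)
	fmt.Println(pairs, err)
}
```

Rejecting mismatched list lengths at startup surfaces a misconfiguration early instead of at the first lookup.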

http_block_router.go

Lines changed: 157 additions & 0 deletions
@@ -0,0 +1,157 @@
+package main
+
+import (
+	"context"
+	"crypto/tls"
+	"fmt"
+	"net"
+	"net/http"
+	"net/url"
+	"os"
+	"strconv"
+	"strings"
+	"time"
+
+	drclient "github.com/ipfs/boxo/routing/http/client"
+	"github.com/ipfs/boxo/routing/http/types"
+	"github.com/ipfs/boxo/routing/http/types/iter"
+	"github.com/ipfs/go-cid"
+	"github.com/libp2p/go-libp2p/core/peer"
+	"github.com/multiformats/go-multiaddr"
+)
+
+type httpBlockRouter struct {
+	endpoint   string
+	endpointMa multiaddr.Multiaddr
+	peerID     peer.ID
+	httpClient *http.Client
+}
+
+const httpBlockRouterTimeout = 5 * time.Second
+
+// newHTTPBlockRouter returns a router backed by a trustless HTTP gateway
+// (https://specs.ipfs.tech/http-gateways/trustless-gateway/) at the specified
+// endpoint. If the gateway responds with HTTP 200 to an HTTP HEAD request,
+// FindProviders returns a provider record with a predefined peerID and the
+// gateway URL represented as a multiaddr with a /tls/http suffix.
+func newHTTPBlockRouter(endpoint string, p peer.ID, client *http.Client) (httpBlockRouter, error) {
+	if client == nil {
+		client = defaultHTTPBlockRouterClient(false)
+	}
+	if client.Timeout == 0 {
+		client.Timeout = httpBlockRouterTimeout
+	}
+
+	u, err := url.Parse(endpoint)
+	if err != nil {
+		return httpBlockRouter{}, fmt.Errorf("failed to parse endpoint %s: %w", endpoint, err)
+	}
+	if u.Scheme != "http" && u.Scheme != "https" {
+		return httpBlockRouter{}, fmt.Errorf("unsupported scheme %s, only http and https are supported", u.Scheme)
+	}
+
+	h := u.Hostname()
+	ip := net.ParseIP(h)
+	var hostComponent string
+	if ip == nil {
+		hostComponent = "dns"
+	} else if strings.Contains(h, ":") {
+		hostComponent = "ip6"
+	} else {
+		hostComponent = "ip4"
+	}
+
+	var port int
+	if u.Port() != "" {
+		if p, err := strconv.Atoi(u.Port()); err != nil {
+			return httpBlockRouter{}, fmt.Errorf("invalid port %s: %w", u.Port(), err)
+		} else {
+			port = p
+		}
+	} else {
+		if u.Scheme == "https" {
+			port = 443
+		} else {
+			port = 80
+		}
+	}
+
+	var tlsComponent string
+	if u.Scheme == "https" {
+		tlsComponent = "/tls"
+	} else if os.Getenv("DEBUG") == "true" {
+		// allow unencrypted HTTP for local debugging
+		tlsComponent = ""
+	} else {
+		return httpBlockRouter{}, fmt.Errorf("failed to parse endpoint %s: only HTTPS providers are allowed (unencrypted HTTP can't be used in web browsers)", endpoint)
+
+	}
+
+	var httpPathComponent string
+	if escPath := u.EscapedPath(); escPath != "" && escPath != "/" {
+		return httpBlockRouter{}, fmt.Errorf("failed to parse endpoint %s: only URLs without path are supported", endpoint)
+	}
+
+	endpointMaStr := fmt.Sprintf("/%s/%s/tcp/%d%s/http%s", hostComponent, h, port, tlsComponent, httpPathComponent)
+
+	ma, err := multiaddr.NewMultiaddr(endpointMaStr)
+	if err != nil {
+		return httpBlockRouter{}, fmt.Errorf("failed to parse endpoint %s: %w", endpoint, err)
+	}
+	return httpBlockRouter{
+		endpoint:   endpoint,
+		endpointMa: ma,
+		peerID:     p,
+		httpClient: client,
+	}, nil
+}
+
+func defaultHTTPBlockRouterClient(insecureSkipVerify bool) *http.Client {
+	transport := http.DefaultTransport
+	if insecureSkipVerify {
+		transport = &http.Transport{
+			TLSClientConfig: &tls.Config{
+				InsecureSkipVerify: true, // Disable TLS cert validation for tests
+			},
+		}
+	}
+	return &http.Client{
+		Timeout: httpBlockRouterTimeout, // timeout hanging HTTP HEAD sooner than boxo/routing/http/server.DefaultRoutingTimeout
+		Transport: &drclient.ResponseBodyLimitedTransport{
+			RoundTripper: transport,
+			LimitBytes:   1 << 12, // max 4KiB -- should be plenty for HEAD response
+			UserAgent:    "someguy/" + buildVersion(),
+		},
+	}
+}
+
+func (h httpBlockRouter) FindProviders(ctx context.Context, c cid.Cid, limit int) (iter.ResultIter[types.Record], error) {
+	req, err := http.NewRequestWithContext(ctx, "HEAD", fmt.Sprintf("%s/ipfs/%s?format=raw", h.endpoint, c), nil)
+	if err != nil {
+		return nil, err
+	}
+	req.Header.Set("Accept", "application/vnd.ipld.raw")
+	httpClient := h.httpClient
+	if httpClient == nil {
+		httpClient = http.DefaultClient
+	}
+
+	resp, err := httpClient.Do(req)
+	if err == nil && resp.StatusCode == http.StatusOK {
+		return iter.ToResultIter(iter.FromSlice([]types.Record{
+			&types.PeerRecord{
+				Schema: types.SchemaPeer,
+				ID:     &h.peerID,
+				Addrs: []types.Multiaddr{
+					{Multiaddr: h.endpointMa},
+				},
+				Protocols: []string{"transport-ipfs-gateway-http"},
+				Extra:     nil,
+			},
+		})), nil
+	}
+	// everything that is not HTTP 200, including errors, produces empty response
+	return iter.ToResultIter(iter.FromSlice([]types.Record{})), nil
+}
+
+var _ providersRouter = (*httpBlockRouter)(nil)
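
To make the URL-to-multiaddr mapping in `newHTTPBlockRouter` concrete, here is a standard-library-only sketch that reproduces the string-building steps for HTTPS endpoints. It deliberately leaves out the `multiaddr.NewMultiaddr` validation and the `DEBUG` escape hatch for plain HTTP. `endpointToMultiaddrString` is a hypothetical helper that is not part of this commit, and the hostnames are placeholders.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strconv"
	"strings"
)

// endpointToMultiaddrString mirrors the host, port and TLS component logic of
// newHTTPBlockRouter, but returns the multiaddr as a plain string and accepts
// only https URLs without a path.
func endpointToMultiaddrString(endpoint string) (string, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", fmt.Errorf("failed to parse endpoint %s: %w", endpoint, err)
	}
	if u.Scheme != "https" {
		return "", fmt.Errorf("endpoint %s: only https is supported in this sketch", endpoint)
	}
	if p := u.EscapedPath(); p != "" && p != "/" {
		return "", fmt.Errorf("endpoint %s: only URLs without path are supported", endpoint)
	}

	// Hostname() strips the port and the brackets around IPv6 literals.
	h := u.Hostname()
	hostComponent := "dns"
	if ip := net.ParseIP(h); ip != nil {
		if strings.Contains(h, ":") {
			hostComponent = "ip6"
		} else {
			hostComponent = "ip4"
		}
	}

	port := 443 // default for https when the URL has no explicit port
	if ps := u.Port(); ps != "" {
		p, err := strconv.Atoi(ps)
		if err != nil {
			return "", fmt.Errorf("invalid port %s: %w", ps, err)
		}
		port = p
	}
	return fmt.Sprintf("/%s/%s/tcp/%d/tls/http", hostComponent, h, port), nil
}

func main() {
	for _, e := range []string{
		"https://gateway.example.com",
		"https://127.0.0.1:8443/",
		"https://[::1]:8443",
		"https://gateway.example.com/some/path",
	} {
		ma, err := endpointToMultiaddrString(e)
		fmt.Printf("%s -> %q (err: %v)\n", e, ma, err)
	}
}
```

The first three inputs map to `/dns/gateway.example.com/tcp/443/tls/http`, `/ip4/127.0.0.1/tcp/8443/tls/http` and `/ip6/::1/tcp/8443/tls/http`. The last one is rejected because it carries a path.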

http_block_router_test.go

Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
+package main
+
+import (
+	"context"
+	"fmt"
+	"net"
+	"net/http"
+	"net/http/httptest"
+	"net/url"
+	"os"
+	"strings"
+	"testing"
+
+	"github.com/ipfs/boxo/routing/http/types"
+	"github.com/ipfs/boxo/routing/http/types/iter"
+	"github.com/ipfs/go-cid"
+	"github.com/libp2p/go-libp2p/core/peer"
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+)
+
+func TestHTTPBlockRouter(t *testing.T) {
+	t.Parallel()
+	debug := os.Getenv("DEBUG") == "true"
+
+	t.Run("FindProviders", func(t *testing.T) {
+		ctx := context.Background()
+		// Set up mock HTTP Provider (trustless gateway) that returns HTTP 200 for specific CID
+		testData := "Thu 8 May 01:07:03 CEST 2025"
+		testCid := cid.MustParse("bafkreie5zycmytdhd5bl4f5jqsayyiwshugf57d4hkd7eif3toh23fsy3i")
+		httpBlockGateway := newMockTrustlessGateway(testCid, testData, debug)
+		t.Cleanup(func() { httpBlockGateway.Close() })
+
+		// Test args
+		endpoint := httpBlockGateway.URL
+		peerId, _ := peer.Decode("12D3KooWCjfPiojcCUmv78Wd1NJzi4Mraj1moxigp7AfQVQvGLwH")
+		insecureSkipVerify := true
+		client := defaultHTTPBlockRouterClient(insecureSkipVerify)
+		httpHost, httpPort, err := splitHostPort(endpoint)
+		assert.NoError(t, err)
+		expectedAddr := fmt.Sprintf("/ip4/%s/tcp/%s/tls/http", httpHost, httpPort)
+
+		// Create Router
+		httpBlockRouter, err := newHTTPBlockRouter(endpoint, peerId, client)
+		assert.NoError(t, err)
+
+		t.Run("return gateway as HTTP provider if HTTP HEAD check returned HTTP 200", func(t *testing.T) {
+			t.Parallel()
+
+			// Ask Router for CID present on trustless gateway
+			it, err := httpBlockRouter.FindProviders(ctx, testCid, 10)
+			require.NoError(t, err)
+
+			results, err := iter.ReadAllResults(it)
+			require.NoError(t, err)
+			require.Len(t, results, 1)
+
+			// Verify returned provider points at http gateway URL
+			peerRecord := results[0].(*types.PeerRecord)
+			require.Equal(t, peerId, *peerRecord.ID)
+			require.Len(t, peerRecord.Addrs, 1)
+			assert.NoError(t, err)
+			require.Equal(t, expectedAddr, peerRecord.Addrs[0].String())
+		})
+
+		t.Run("return no results if HTTP HEAD check returned HTTP 404", func(t *testing.T) {
+			t.Parallel()
+
+			// This CID has no providers
+			failCid := cid.MustParse("bafkreie5keu4z5kgutjds5tz3ahdxhcdkn4hl2vr7snenml44ui7y4yfki")
+
+			// Ask Router for CID not present on trustless gateway
+			it, err := httpBlockRouter.FindProviders(ctx, failCid, 10)
+			require.NoError(t, err)
+
+			results, err := iter.ReadAllResults(it)
+			require.NoError(t, err)
+			require.Len(t, results, 0)
+		})
+
+	})
+}
+
+// newMockTrustlessGateway pretends to be an HTTP provider that supports
+// block responses https://specs.ipfs.tech/http-gateways/trustless-gateway/#block-responses-application-vnd-ipld-raw
+func newMockTrustlessGateway(c cid.Cid, body string, debug bool) *httptest.Server {
+	expectedPathPrefix := "/ipfs/" + c.String()
+	handler := http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
+		if debug {
+			fmt.Printf("mockTrustlessGateway %s %s\n", req.Method, req.URL.Path)
+		}
+		if strings.HasPrefix(req.URL.Path, expectedPathPrefix) {
+			w.Header().Set("Content-Type", "application/vnd.ipld.raw")
+			w.WriteHeader(http.StatusOK)
+			if req.Method == "GET" {
+				_, err := w.Write([]byte(body))
+				if err != nil {
+					fmt.Fprintf(os.Stderr, "mockTrustlessGateway %s %s error: %v\n", req.Method, req.URL.Path, err)
+				}
+			}
+			return
+		} else if strings.HasPrefix(req.URL.Path, "/ipfs/bafkqaaa") {
+			// This is the probe from https://specs.ipfs.tech/http-gateways/trustless-gateway/#dedicated-probe-paths
+			w.Header().Set("Content-Type", "application/vnd.ipld.raw")
+			w.WriteHeader(http.StatusOK)
+			return
+		} else {
+			http.Error(w, "Not Found", http.StatusNotFound)
+			return
+		}
+	})
+
+	// Make it HTTP/2 with self-signed TLS cert
+	srv := httptest.NewUnstartedServer(handler)
+	srv.EnableHTTP2 = true
+	srv.StartTLS()
+	return srv
+}
+
+func splitHostPort(httpUrl string) (ipAddr string, port string, err error) {
+	u, err := url.Parse(httpUrl)
+	if err != nil {
+		return "", "", err
+	}
+	if u.Scheme == "" || u.Host == "" {
+		return "", "", fmt.Errorf("invalid URL format: missing scheme or host")
+	}
+	ipAddr, port, err = net.SplitHostPort(u.Host)
+	if err != nil {
+		return "", "", fmt.Errorf("failed to split host and port from %q: %w", u.Host, err)
+	}
+	return ipAddr, port, nil
+}
