
Conversation

@Revolyssup
Contributor

@Revolyssup Revolyssup commented Sep 9, 2025

Goals:

  • Provide an API endpoint for debugging.
  • Provide an API that generates translated ADC configurations based on the latest Gateway API/Ingress/CRD resources for inspection and debugging, outputting them in JSON.
  • Provide a homepage guide. Use pure HTML to create a simple title and link list.

Solution:

  • Use the underlying store that ADC uses to sync data as the store for the debug handler.
  • Create a new runnable Server that registers handlers on it and runs the server.

Screencast.From.2025-09-12.15-01-20.mp4

Type of change:

  • Bugfix
  • New feature provided
  • Improve performance
  • Backport patches
  • Documentation
  • Refactor
  • Chore
  • CI/CD or Tests

What this PR does / why we need it:

Pre-submission checklist:

  • Did you explain what problem does this PR solve? Or what new features have been added?
  • Have you added corresponding test cases?
  • Have you modified the corresponding document?
  • Is this PR backward compatible? If it is not backward compatible, please discuss on the mailing list first

@Revolyssup Revolyssup marked this pull request as draft September 9, 2025 18:18
syncCh: make(chan struct{}, 1),
client: cli,
// TODO: Maybe pass port/address from external configuration
adcdebugserver: common.NewADCDebugServer(cli.Store, cli.ConfigManager, 8432),
Contributor Author

I don't think this port needs to be configurable. Maybe we can hardcode it to some value.

@Revolyssup Revolyssup marked this pull request as ready for review September 9, 2025 19:42
@AlinsRan
Contributor

AlinsRan commented Sep 10, 2025

I don't think it should be a standalone server; it should rather be part of the ingress controller API.

@AlinsRan
Contributor

AlinsRan commented Sep 12, 2025

I don't think it should be a standalone server; it should rather be part of the ingress controller API.

Suggestion: Move the debug API into the unified Ingress Server

In the current implementation, the debug HTTP server is started directly inside the provider.
While this works, it introduces several issues:

  • Tight coupling of responsibilities: The provider’s job is to synchronize resources, not to start its own HTTP server.
  • Lack of configurability: Port, paths, authentication, etc. are “black-boxed” inside the provider and cannot be configured or extended externally.
  • Harder to extend: If other components later need to expose debug endpoints, you will end up with multiple scattered servers.

A cleaner approach is to:

  • Start a unified API server at the Ingress Server layer, e.g.:
      srv := NewServer(":9090")
      srv.Register(&adcDebugAPI{})
      srv.Register(&systemAPI{})
      mgr.Add(srv)
  • Let each component (including the provider) only implement a RegisterHandlers/DebugProvider interface to attach its handlers, without starting its own server.
  • This way, port, authentication, middleware, etc. can be configured in one place, and the server lifecycle follows the manager.

@Revolyssup
Contributor Author

I don't think it should be a standalone server; it should rather be part of the ingress controller API.

done

@Revolyssup Revolyssup changed the title from "feat: add adc debug server" to "feat: add unified API server with debugging capabilities" Sep 12, 2025
Comment on lines +51 to +58
func (s *Server) Register(pathPrefix string, registrant provider.RegisterHandler) {
	subMux := http.NewServeMux()
	registrant.Register(pathPrefix, subMux)
	s.mux.Handle(pathPrefix+"/", http.StripPrefix(pathPrefix, subMux))
	s.mux.HandleFunc(pathPrefix, func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, pathPrefix+"/", http.StatusPermanentRedirect)
	})
}
Contributor

The debug server currently has no authentication mechanism, making it accessible to anyone who can reach the endpoint. Should we consider incorporating certain authentication mechanisms?

Contributor Author

@Revolyssup Revolyssup Sep 15, 2025

The original requirement was to return simple, UI-friendly HTML. Adding authentication would mean adding another page for login. Maybe we can use cookie-based authentication and add one more simple login HTML page.

  1. Check whether the request has an INGRESS-DEBUG_TOKEN cookie. If present, serve normally.
  2. If not present, redirect to the login page. This will also be a simple HTML page with just an input box for the debug token. The debug token will be set in the ingress controller's static configuration, so it can be copied from there and pasted here. When the user clicks login, the value is stored in the cookie if it matches, and the user is logged in.

@bzp2010 @ronething @AlinsRan What do you think?

Contributor

I think it's enough not to expose the port, just like the control-api.

Contributor

I think it's enough not to expose the port, just like the control-api.

It might be a bit different. This debug server exposes SSL/Consumer resources, but the control-api does not.

Contributor

It might be a bit different. This debug server exposes SSL/Consumer resources, but the control-api does not.

You're right.

Contributor Author

I think the only Kubernetes-native way to enable/disable this dynamically would be to use some existing CR for it, but that might be overkill. I think we should opt for one of two ways:

  1. Disable this server by default and allow enabling it only statically, not dynamically. It's a debug server, and I'm not sure dynamically enabling/disabling it is a strong enough use case. This is the simplest and safest option.
  2. Have a single login page that expects a statically defined key, as suggested above.

If it's okay to enable/disable this statically, then I recommend the first way; anything more would be overengineering. If a use case for dynamic enabling/disabling emerges later, we can give it more thought.

Contributor

@bzp2010 bzp2010 Sep 15, 2025

Yes, that's exactly what I was thinking. A simple static config.

However, I must confirm whether the restart required to enable this feature would disrupt the current "state". Specifically, can we guarantee consistent results based on unchanged configurations? We need to ensure that the restart does not corrupt any potentially existing in-memory state, thereby preventing the debugging of previously existing issues.

For example, we sometimes encounter problems that resolve after a restart, making it impossible to analyze the root cause.

Can anyone answer this question?

Contributor

It’s not possible to guarantee the preservation of “state” across a restart.
The controller’s declarative configuration state (stored in Kubernetes/etcd via CRDs) is automatically rebuilt after a restart, but any runtime in-memory state (caches, queues, goroutine stacks, etc.) will inevitably be lost.

For an ingress controller, a restart is typically the last-resort mechanism to recover state, so if you need to debug live in-memory data, it will naturally no longer be available after restarting.

At present, for security and simplicity reasons, we recommend keeping the debug API disabled by default and enabling it only via static configuration.
If there is a strong need to inspect runtime state in the future, we can revisit supporting dynamic enablement or enabling it up front with proper authentication.

Contributor Author

Done. Added the boolean to the static config.

Contributor

@bzp2010 bzp2010 Sep 15, 2025

The state itself isn't critical, but it would be preferable if you could confirm it doesn't impact configuration generation.

The key point is: for a given input of Kubernetes resources, always produce a fixed output. This includes how conflicting resources are handled (how priority is determined), how resource lists are sorted, and so on. This process already exists within AIC, for example during full synchronization at startup. We can ignore changes triggered by reconciliation tasks.
If these could be leveraged for the debugging functionality, that would be excellent.
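One ingredient of "fixed output for a given input" is a deterministic ordering of resource lists before serialization. The sketch below shows only that piece, with a hypothetical `Service` type and `renderServices` helper; it does not reflect how AIC actually resolves conflicts or orders resources.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Service is a hypothetical, trimmed-down resource type for illustration.
type Service struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

// renderServices sorts a copy of the input by ID (then Name as a tie-breaker)
// so the JSON output does not depend on the order resources were observed in.
func renderServices(services []Service) ([]byte, error) {
	sorted := append([]Service(nil), services...) // leave the caller's slice untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		if sorted[i].ID != sorted[j].ID {
			return sorted[i].ID < sorted[j].ID
		}
		return sorted[i].Name < sorted[j].Name
	})
	return json.Marshal(sorted)
}

func main() {
	a, _ := renderServices([]Service{{ID: "b", Name: "svc-b"}, {ID: "a", Name: "svc-a"}})
	b, _ := renderServices([]Service{{ID: "a", Name: "svc-a"}, {ID: "b", Name: "svc-b"}})
	fmt.Println(string(a) == string(b)) // same resources, same bytes
}
```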

metrics_addr: ":8080"   # The address the metrics endpoint binds to.
                        # The default value is ":8080".
enable_server: false    # The debug API is behind this server which is disabled by default for security reasons.
server_addr: ":9092"
Contributor

Bind to 127.0.0.1?
Should the field comments emphasize that this server is for debugging?

Contributor Author

Bound to localhost.
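The difference between an address like ":9092" and a loopback address can be checked mechanically. The helper below is a hypothetical sketch (`isLoopbackAddr` is not part of the PR) using only the standard `net` package.

```go
package main

import (
	"fmt"
	"net"
)

// isLoopbackAddr reports whether a listen address is restricted to the local
// machine. An empty host (as in ":9092") means all interfaces, so it is
// reported as not loopback.
func isLoopbackAddr(addr string) bool {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return false
	}
	if host == "localhost" {
		return true
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

func main() {
	for _, addr := range []string{":9092", "127.0.0.1:9092", "[::1]:9092", "0.0.0.0:9092"} {
		fmt.Printf("%-16s loopback=%v\n", addr, isLoopbackAddr(addr))
	}
}
```

A check like this could run at startup to warn when the debug server is reachable from outside the pod.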

Contributor Author

@Revolyssup Revolyssup Sep 16, 2025

The purpose is generic, although currently it's to debug the in-memory state of the ADC config. Maybe we can add a list in the comment.

  1. /debug can be used to debug the in-memory state of translated ADC configs to be synced with the data plane.
    ... later, if handlers are added, we can extend the list

Contributor Author

added

@AlinsRan AlinsRan requested a review from Copilot September 17, 2025 01:35
Contributor

Copilot AI left a comment

Pull Request Overview

Adds an ADC debug server with REST API endpoints to provide debugging capabilities for APISIX Ingress Controller configurations. The server exposes translated ADC configurations in JSON format for inspection and troubleshooting.

  • Creates a new unified API server with debugging endpoints
  • Implements HTML-based navigation interface for browsing ADC configurations
  • Integrates debug server as an optional component in the controller manager

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| internal/provider/register.go | Adds RegisterHandler interface for HTTP handler registration |
| internal/provider/provider.go | Extends Provider interface to embed RegisterHandler |
| internal/provider/common/adcdebugserver.go | Implements ADCDebugProvider with HTML templates and JSON endpoints |
| internal/provider/apisix/provider.go | Implements Register method to setup debug handlers |
| internal/manager/server/server.go | Creates new HTTP server with handler registration capabilities |
| internal/manager/run.go | Integrates debug server into controller manager |
| internal/controller/config/types.go | Adds server configuration fields |
| internal/controller/config/config.go | Sets default server address in configuration |
| internal/adc/client/client.go | Initializes ADCDebugProvider with shared store and config manager |
| config/samples/config.yaml | Documents server configuration options |


</html>
`)

_ = tmpl.Execute(w, struct {
Copilot AI Sep 17, 2025

Error from template execution is being ignored. This could lead to silent failures when rendering HTML responses. Consider handling the error appropriately, either by logging it or returning an HTTP error response.

</html>
`)

_ = tmpl.Execute(w, struct {
Copilot AI Sep 17, 2025

Template execution errors are being ignored in multiple locations. This could lead to silent failures when rendering HTML responses. Consider handling these errors appropriately, either by logging them or returning HTTP error responses.

asrv.pathPrefix, configNameEncoded, url.QueryEscape(resourceType), url.QueryEscape(svc.ID)),
})
}
case "consumers":
Contributor

Avoid hard-coding these resource type strings.

Contributor Author

done

Contributor

const (
	TypeRoute          = "route"
	TypeService        = "service"
	TypeConsumer       = "consumer"
	TypeSSL            = "ssl"
	TypeGlobalRule     = "global_rule"
	TypePluginMetadata = "plugin_metadata"
)

Contributor Author

replaced with adctypes

@AlinsRan AlinsRan requested a review from ronething September 17, 2025 09:38
Contributor

@ronething ronething left a comment

lgtm.

@Revolyssup Revolyssup merged commit b47ed04 into apache:master Sep 18, 2025
24 of 25 checks passed
4 participants