| 1 | +# Sandboxing POC for Source Declarative Manifest |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This document describes the proof-of-concept (POC) implementation of two sandboxing solutions for the `source-declarative-manifest` connector: |
| 6 | + |
| 7 | +1. **Firejail**: A SUID sandbox program that restricts the running environment using Linux namespaces and seccomp-bpf |
| 8 | +2. **gVisor**: A user-space kernel that implements a substantial portion of the Linux system call interface |
| 9 | + |
| 10 | +The implementation is available in [PR #399](https://github.com/airbytehq/airbyte-python-cdk/pull/399). |
| 11 | + |
| 12 | +## Implementation Details |
| 13 | + |
| 14 | +Both POC implementations: |
| 15 | +- Start from the `airbyte/source-declarative-manifest` Docker image |
| 16 | +- Add the respective sandboxing solution |
| 17 | +- Wrap the original entry point with the sandboxing solution |
| 18 | +- Preserve all command-line arguments and functionality |
| 19 | + |
| 20 | +### Firejail Implementation |
| 21 | + |
| 22 | +Firejail provides a lightweight sandboxing solution using Linux namespaces and seccomp-bpf. The implementation: |
| 23 | + |
| 24 | +- Installs Firejail via apt-get |
| 25 | +- Creates a wrapper script that runs the original entry point through Firejail |
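| 26 | +- Uses the `--noprofile`, `--quiet`, and `--private` flags for basic isolation: `--noprofile` runs without a security profile, `--quiet` suppresses Firejail's own output, and `--private` mounts a temporary home directory that is discarded when the sandbox exits |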
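
The wrapper described in the list above can be sketched in Python as a function that builds the Firejail command line. This is a minimal illustration, not the script from the PR: the entry point name `source-declarative-manifest` and the helper names are assumptions, and only the three flags come from this document.

```python
import shlex
from typing import List, Sequence

# Hypothetical entry point command; the real image's entry point may differ.
ORIGINAL_ENTRYPOINT: List[str] = ["source-declarative-manifest"]

# The flags this document lists for basic isolation.
FIREJAIL_FLAGS: List[str] = ["--noprofile", "--quiet", "--private"]


def build_firejail_command(connector_args: Sequence[str]) -> List[str]:
    """Return the argv that runs the entry point under Firejail.

    Connector arguments are appended unchanged, so commands such as
    `spec` reach the connector exactly as the caller passed them.
    """
    return ["firejail", *FIREJAIL_FLAGS, *ORIGINAL_ENTRYPOINT, *connector_args]


def format_command(argv: Sequence[str]) -> str:
    """Render an argv list as a shell-quoted string for logging."""
    return shlex.join(argv)
```

Building the command as a list rather than a single string avoids quoting problems when arguments contain spaces, for example a path passed to `--config`.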
| 27 | + |
| 28 | +Key benefits of Firejail: |
| 29 | +- Lightweight with minimal overhead |
| 30 | +- Easy to configure with profiles |
| 31 | +- Mature and well-documented |
| 32 | + |
| 33 | +Resources: |
| 34 | +- [Firejail Documentation](https://firejail.wordpress.com/) |
| 35 | +- [Firejail GitHub Repository](https://github.com/netblue30/firejail) |
| 36 | + |
| 37 | +### gVisor Implementation |
| 38 | + |
| 39 | +gVisor provides a more comprehensive sandboxing solution by implementing a user-space kernel. The implementation: |
| 40 | + |
| 41 | +- Installs gVisor's runsc via the official repository |
| 42 | +- Creates a wrapper script that runs the original entry point |
| 43 | +- Note: The initial runsc invocation had issues with the flag format, so the current version uses a direct Python wrapper and does not yet run the entry point under runsc |
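
A direct Python wrapper of the kind mentioned in the note could look roughly like the sketch below. It is an assumption about the shape of such a wrapper, not the code from the PR: the entry point name and function names are hypothetical. It forwards arguments unchanged and adds no gVisor isolation.

```python
import os
from typing import List, Sequence

# Hypothetical entry point; substitute the command the base image actually runs.
ORIGINAL_ENTRYPOINT: List[str] = ["source-declarative-manifest"]


def build_passthrough_command(connector_args: Sequence[str]) -> List[str]:
    """Return the original entry point followed by the unmodified arguments."""
    return [*ORIGINAL_ENTRYPOINT, *connector_args]


def run_passthrough(connector_args: Sequence[str]) -> None:
    """Replace the wrapper process with the original entry point.

    Using exec instead of a child process means the connector's exit code
    and signals are seen directly by the container runtime.
    """
    command = build_passthrough_command(connector_args)
    os.execvp(command[0], command)
```

A wrapper script would call `run_passthrough(sys.argv[1:])` as its last step. Once the runsc flag issue is resolved, the runsc invocation would be prepended inside `build_passthrough_command`.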
| 44 | + |
| 45 | +Key benefits of gVisor: |
| 46 | +- Strong isolation through a user-space kernel |
| 47 | +- Compatible with the OCI runtime specification |
| 48 | +- Active development by Google |
| 49 | + |
| 50 | +Resources: |
| 51 | +- [gVisor Documentation](https://gvisor.dev/docs/) |
| 52 | +- [gVisor GitHub Repository](https://github.com/google/gvisor) |
| 53 | + |
| 54 | +## Testing Results |
| 55 | + |
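| 56 | +Both Docker images were built and tested locally with the `spec` command to verify basic functionality. Because the gVisor image currently bypasses runsc, its result shows only that the wrapper preserves the connector's behavior, not that the connector runs under gVisor. |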
| 57 | + |
| 58 | +### Firejail Test Results |
| 59 | +``` |
| 60 | +docker run --rm airbyte/source-declarative-manifest-firejail spec |
| 61 | +{"type":"SPEC","spec":{"connectionSpecification":{"$schema":"http://json-schema.org/draft-07/schema#","title":"Low-code source spec","type":"object","required":["__injected_declarative_manifest"],"additionalProperties":true,"properties":{"__injected_declarative_manifest":{"title":"Low-code manifest","type":"object","description":"The low-code manifest that defines the components of the source."}}},"documentationUrl":"https://docs.airbyte.com/integrations/sources/low-code","supportsNormalization":false,"supportsDBT":false}} |
| 62 | +``` |
| 63 | + |
| 64 | +### gVisor Test Results |
| 65 | +``` |
| 66 | +docker run --rm airbyte/source-declarative-manifest-gvisor spec |
| 67 | +{"type":"SPEC","spec":{"connectionSpecification":{"$schema":"http://json-schema.org/draft-07/schema#","title":"Low-code source spec","type":"object","required":["__injected_declarative_manifest"],"additionalProperties":true,"properties":{"__injected_declarative_manifest":{"title":"Low-code manifest","type":"object","description":"The low-code manifest that defines the components of the source."}}},"documentationUrl":"https://docs.airbyte.com/integrations/sources/low-code","supportsNormalization":false,"supportsDBT":false}} |
| 68 | +``` |
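
The manual comparison of the two outputs above can be automated. The sketch below is one way to do it, assuming each image prints a single JSON line for `spec`. The function names are hypothetical, and the checks cover only the fields visible in the output shown here.

```python
import json
from typing import Any, Dict


def parse_spec_message(line: str) -> Dict[str, Any]:
    """Parse one line of connector output and return its `spec` object.

    Raises ValueError if the line is not a SPEC message or lacks a
    connectionSpecification.
    """
    message = json.loads(line)
    if message.get("type") != "SPEC":
        raise ValueError(f"expected a SPEC message, got {message.get('type')!r}")
    spec = message.get("spec")
    if not isinstance(spec, dict) or "connectionSpecification" not in spec:
        raise ValueError("SPEC message has no connectionSpecification")
    return spec


def same_spec(first_line: str, second_line: str) -> bool:
    """Return True when two images report an identical spec."""
    return parse_spec_message(first_line) == parse_spec_message(second_line)
```

Comparing parsed objects rather than raw strings keeps the check stable if key order or whitespace differs between images.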
| 69 | + |
| 70 | +## Challenges Encountered |
| 71 | + |
| 72 | +Two challenges came up during implementation: |
| 73 | + |
| 74 | +1. **gVisor runsc Command Syntax**: The initial gVisor wrapper script had issues with the flag format: `--network=host` needed to be changed to `--network host`. For simplicity, the current implementation uses a direct Python wrapper and does not invoke runsc. |
| 75 | + |
| 76 | +2. **Docker Build Escaping**: The initial Dockerfiles had escaping problems in their multiline `echo` commands. This was fixed by splitting each one into multiple single-line `echo` commands with redirection. |
| 77 | + |
| 78 | +## Considerations for Production Use |
| 79 | + |
| 80 | +For production use, consider: |
| 81 | +- Performance impact of each sandboxing solution |
| 82 | +- Security requirements and threat model |
| 83 | +- Compatibility with existing infrastructure |
| 84 | +- Maintenance overhead |
| 85 | +- Further refinement of the gVisor implementation to properly use runsc |
| 86 | + |
| 87 | +## Conclusion |
| 88 | + |
| 89 | +This POC demonstrates two approaches to sandboxing the `source-declarative-manifest` connector. The Firejail implementation runs the connector inside the sandbox and passes the `spec` test, while the gVisor implementation does not yet invoke runsc and needs further work before it provides gVisor isolation. The choice between the two depends on the specific security requirements and performance considerations. |