Troubleshooting
- Introduction
- Common Installation Issues
- Configuration Problems
- Device Communication Issues
- Metadata Service Connectivity
- Storage Backend Issues
- Error Code Reference
- Diagnostic Tools and Logging
- Performance Troubleshooting
- Support Resources
The Post-Quantum WebAuthn Platform is a sophisticated authentication system that combines traditional WebAuthn protocols with post-quantum cryptographic algorithms. This troubleshooting guide addresses common issues encountered during installation, configuration, and operation of the platform.
The platform consists of several key components:
- WebAuthn Server: Flask-based web server handling authentication requests
- HID Layer: Hardware interface for USB/FIDO devices
- Metadata Service: FIDO Alliance Metadata Service integration
- Storage Backends: Local and cloud storage for credentials
- Post-Quantum Cryptography: liboqs integration for quantum-resistant algorithms
Problem: The platform fails to initialize due to liboqs loading issues.
Symptoms:
- ImportError: "oqs bindings are unavailable"
- Application startup failures
- Quantum-resistant algorithm unavailability
Diagnosis Steps:
- Check liboqs installation:
python -c "import oqs; print(oqs.get_enabled_sig_mechanisms())"
- Verify library paths:
ldd prebuilt_liboqs/linux-x86_64/lib/liboqs.so
- Check environment variables:
echo $LD_LIBRARY_PATH
echo $PYTHONPATH
Solutions:
- Missing Dependencies: Install required system libraries:
sudo apt-get install build-essential cmake pkg-config
- Library Path Issues: Set appropriate environment variables:
export LD_LIBRARY_PATH=/path/to/liboqs:$LD_LIBRARY_PATH
export PYTHONPATH=/path/to/python/modules:$PYTHONPATH
- Version Compatibility: Ensure liboqs version matches requirements:
python -c "import oqs; print(oqs.oqs_version())"
Section sources
- pqc.py
- prebuilt_liboqs/linux-x86_64/include/oqs/common.h
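The diagnosis steps above can be gathered into one script. The following is a minimal sketch that uses only the standard library; the function name `check_module_available` is hypothetical and not part of the platform. It reports whether a module such as `oqs` can be located, without importing it, together with the path variables checked above.

```python
import importlib.util
import os


def check_module_available(name: str) -> dict:
    """Report whether a top-level module can be located, without importing it."""
    spec = importlib.util.find_spec(name)
    return {
        "module": name,
        "found": spec is not None,
        # origin is the file the module would be loaded from, when known
        "origin": getattr(spec, "origin", None),
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH", ""),
        "PYTHONPATH": os.environ.get("PYTHONPATH", ""),
    }


if __name__ == "__main__":
    for key, value in check_module_available("oqs").items():
        print(f"{key}: {value}")
```

A module that is found can still fail at import time if the shared library behind it cannot be loaded, so a `found: True` result narrows the problem to the native library path rather than ruling it out.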
Problem: Incorrect or missing environment variables causing startup failures.
Common Variables:
- FIDO_SERVER_SECRET_KEY: Session encryption key
- FIDO_SERVER_RP_ID: Relying Party identifier
- FIDO_SERVER_GCS_BUCKET: Google Cloud Storage bucket
- FIDO_SERVER_GCS_CREDENTIALS_FILE: Service account credentials
Diagnosis:
# Check all FIDO-related environment variables
env | grep FIDO_SERVER_
# Verify specific variables
echo $FIDO_SERVER_SECRET_KEY
echo $FIDO_SERVER_RP_ID
Solutions:
- Secret Key Generation:
# Generate secure secret key
openssl rand -hex 32 > secret.key
export FIDO_SERVER_SECRET_KEY_FILE=$(pwd)/secret.key
- RP ID Configuration:
# Set relying party identifier
export FIDO_SERVER_RP_ID="your-domain.com"
Section sources
- config.py
Problem: Misconfigured environment variables leading to operational failures.
Common Issues:
- Invalid secret key format
- Malformed RP ID
- Incorrect storage backend configuration
Diagnostic Commands:
# Test secret key configuration
python -c "
from server.server.config import _resolve_secret_key
print('Secret key resolution successful')
"
# Validate RP ID
python -c "
from server.server.config import determine_rp_id
print(determine_rp_id())
"
Configuration Validation:
# Example configuration validation
import os
from server.server.config import app
# Check required environment variables
required_vars = ['FIDO_SERVER_SECRET_KEY', 'FIDO_SERVER_RP_ID']
for var in required_vars:
    if not os.environ.get(var):
        print(f"Missing required environment variable: {var}")
# Validate storage configuration
if os.environ.get('FIDO_SERVER_GCS_BUCKET'):
    print("Google Cloud Storage enabled")
else:
    print("Using local storage")
Section sources
- config.py
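The two most common format problems listed above, a malformed RP ID and an invalid secret key, can be caught before the server starts. The sketch below is illustrative only: `rp_id_problems` and `hex_secret_problems` are hypothetical helpers, and the hex check assumes the key was produced with `openssl rand -hex 32` as shown earlier; the formats the platform's own `config.py` accepts may differ. A WebAuthn RP ID is a bare domain, so a scheme, port, or path is reported as a problem.

```python
import re
import string
from typing import List

# Hostname labels: letters, digits, hyphens; no leading or trailing hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def rp_id_problems(rp_id: str) -> List[str]:
    """Return the problems found in an RP ID (an empty list means none found)."""
    if not rp_id:
        return ["RP ID is empty"]
    problems = []
    without_scheme = rp_id.split("://")[-1]
    if "://" in rp_id:
        problems.append("contains a scheme")
    if "/" in without_scheme:
        problems.append("contains a path")
    if ":" in without_scheme:
        problems.append("contains a port")
    host = without_scheme.split("/")[0].split(":")[0]
    if not all(_LABEL.match(label) for label in host.split(".")):
        problems.append("host part is not a valid domain name")
    return problems


def hex_secret_problems(value: str, min_bytes: int = 32) -> List[str]:
    """Check a hex-encoded secret (assumed format) for charset and length."""
    value = value.strip()
    problems = []
    if any(ch not in string.hexdigits for ch in value):
        problems.append("contains non-hex characters")
    if len(value) < 2 * min_bytes:
        problems.append(f"shorter than {min_bytes} bytes")
    return problems


if __name__ == "__main__":
    print(rp_id_problems("https://your-domain.com/login"))
    print(hex_secret_problems("not-a-key"))
```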
Problem: Unable to connect to storage backends (local or cloud).
Local Storage Issues:
- Permission denied errors
- Disk space limitations
- Directory creation failures
Cloud Storage Issues:
- Authentication failures
- Network connectivity problems
- Bucket access permissions
Diagnostic Steps:
- Local Storage Test:
# Test directory permissions
mkdir -p server/server/session-credentials/test
touch server/server/session-credentials/test/testfile
rm server/server/session-credentials/test/testfile
rmdir server/server/session-credentials/test
- Cloud Storage Test:
# Test GCS connectivity
python -c "
from server.server.cloud_storage import ensure_ready
ensure_ready()
print('Cloud storage ready')
"
Solutions:
- Permission Issues:
# Fix directory permissions
chmod 755 server/server/session-credentials
chmod 644 server/server/session-credentials/*
- Network Connectivity:
# Test network connectivity
curl -I https://storage.googleapis.com
ping storage.googleapis.com
Section sources
- storage.py
- cloud_storage.py
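The local storage checks above (permissions, disk space, directory creation) can also be run from Python, which helps when the server process runs as a different user than the shell. This is a minimal sketch; `check_storage_dir` and its `min_free_mb` threshold are hypothetical and not taken from `storage.py`.

```python
import os
import shutil
import tempfile


def check_storage_dir(path: str, min_free_mb: int = 50) -> dict:
    """Check that a directory exists, is writable, and has free space."""
    report = {"path": path, "exists": os.path.isdir(path),
              "writable": False, "free_mb": None, "enough_space": False}
    if not report["exists"]:
        return report
    free_mb = shutil.disk_usage(path).free // (1024 * 1024)
    report["free_mb"] = free_mb
    report["enough_space"] = free_mb >= min_free_mb
    try:
        # Creating and deleting a real file catches read-only mounts and
        # ownership problems that the permission bits alone can hide.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
        report["writable"] = True
    except OSError:
        report["writable"] = False
    return report


if __name__ == "__main__":
    print(check_storage_dir("server/server/session-credentials"))
```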
Problem: FIDO/HID devices not detected or communication failures.
Symptoms:
- "No FIDO devices found" errors
- Device enumeration failures
- Communication timeouts
HID Layer Architecture:
classDiagram
class CtapDevice {
+capabilities : int
+call(cmd, data, event, on_keepalive) bytes
+close() void
+list_devices() Iterator~CtapDevice~
}
class CtapHidConnection {
+read_packet() bytes
+write_packet(data) void
+close() void
}
class FileCtapHidConnection {
+handle : int
+descriptor : HidDescriptor
+read_packet() bytes
+write_packet(data) void
+close() void
}
class HidDescriptor {
+path : str
+vid : int
+pid : int
+report_size_in : int
+report_size_out : int
+product_name : str
+serial_number : str
}
CtapDevice --> CtapHidConnection : uses
CtapHidConnection <|-- FileCtapHidConnection : implements
FileCtapHidConnection --> HidDescriptor : manages
Diagram sources
- base.py
Diagnostic Commands:
# List USB devices
lsusb
# Check HID devices
ls /dev/hidraw*
# Test device access
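cat /dev/hidraw0 | hexdump -C | head -10
Solutions:
- Device Permissions:
# Add user to the group that owns the hidraw nodes
# (dialout normally covers serial ports; confirm the owning group with: ls -l /dev/hidraw*)
sudo usermod -a -G dialout $USER
# Set device permissions (reset when the device is replugged; a udev rule makes this persistent)
sudo chmod 660 /dev/hidraw*
- Device Enumeration:
# Test device discovery
from fido2.hid import CtapHidDevice
devices = list(CtapHidDevice.list_devices())
print(f"Found {len(devices)} devices")
Section sources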
- base.py
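When enumeration returns zero devices, it helps to separate "no device node exists" from "the node exists but this user cannot open it". The sketch below makes that distinction with the standard library only; `hidraw_access_report` is a hypothetical helper, and the `/dev/hidraw*` pattern applies to Linux.

```python
import glob
import os


def hidraw_access_report(pattern: str = "/dev/hidraw*") -> list:
    """List matching device nodes with the current user's read/write access."""
    report = []
    for path in sorted(glob.glob(pattern)):
        report.append({
            "path": path,
            "readable": os.access(path, os.R_OK),
            "writable": os.access(path, os.W_OK),
        })
    return report


if __name__ == "__main__":
    nodes = hidraw_access_report()
    if not nodes:
        print("No hidraw nodes found: check the cable, port, and lsusb output")
    for node in nodes:
        print(node)
```

An empty list points at hardware or driver detection, while entries with `readable` or `writable` set to `False` point at the permission fixes above.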
Problem: CTAP2 protocol errors during device communication.
Common CTAP2 Error Codes:
| Error Code | Description | Solution |
|---|---|---|
| 0x01 | INVALID_COMMAND | Check command format and parameters |
| 0x02 | INVALID_PARAMETER | Validate input parameters |
| 0x03 | INVALID_LENGTH | Verify data length constraints |
| 0x05 | TIMEOUT | Increase timeout values |
| 0x21 | PROCESSING | Wait for device processing |
| 0x2F | USER_ACTION_TIMEOUT | Prompt the user to confirm on the device in time, then retry |
| 0x31 | PIN_INVALID | Reset PIN or use correct PIN |
| 0x35 | PIN_NOT_SET | Set PIN before use |
Error Handling Implementation:
import logging
from fido2.ctap import CtapError
logger = logging.getLogger(__name__)
try:
    # Device communication (device, command and data come from the calling code)
    response = device.call(command, data)
except CtapError as e:
    if e.code == CtapError.ERR.TIMEOUT:
        # Handle timeout - increase timeout or retry
        pass
    elif e.code == CtapError.ERR.PIN_INVALID:
        # Handle PIN issues
        pass
    else:
        # Log unknown error
        logger.error(f"CTAP error: {e}")
Section sources
- ctap.py
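For transient codes such as TIMEOUT, the "retry" advice in the table can be implemented as a small wrapper with exponential backoff. The sketch below is generic and uses only the standard library; `call_with_retry` and its parameters are hypothetical, and the caller decides which errors count as transient.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(operation: Callable[[], T],
                    is_transient: Callable[[Exception], bool],
                    attempts: int = 3,
                    base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying with exponential backoff on transient errors."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as exc:
            last_attempt = attempt == attempts - 1
            # Give up on the final attempt or on errors a retry cannot fix.
            if last_attempt or not is_transient(exc):
                raise
            sleep(base_delay * (2 ** attempt))  # delay doubles each retry
    raise ValueError("attempts must be at least 1")
```

With python-fido2, `is_transient` could be `lambda e: isinstance(e, CtapError) and e.code == CtapError.ERR.TIMEOUT`. Errors such as PIN_INVALID should not be retried automatically, since authenticators limit the number of PIN attempts.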
Problem: FIDO Metadata Service (MDS) connectivity and validation issues.
Metadata Service Architecture:
sequenceDiagram
participant Client as WebAuthn Client
participant Server as WebAuthn Server
participant MDS as FIDO MDS Service
participant Verifier as Metadata Verifier
Client->>Server : Register/Authenticate Request
Server->>MDS : Fetch Metadata Blob
MDS-->>Server : Metadata Blob + Signature
Server->>Verifier : Validate Metadata
Verifier->>Verifier : Verify Signature
Verifier->>Verifier : Check Revocation Status
Verifier-->>Server : Validation Result
Server-->>Client : Response with Metadata Info
Diagram sources
- mds3.py
Common MDS Issues:
- Network Connectivity: Unable to reach MDS endpoints
- Certificate Validation: SSL/TLS certificate issues
- Revocation Checking: Device revocation status failures
- Metadata Parsing: Corrupted or malformed metadata
Diagnostic Commands:
# Test MDS connectivity
curl -I https://mds3.fidoalliance.org/
# Check certificate chain
openssl s_client -connect mds3.fidoalliance.org:443 -showcerts
# Test metadata download
curl https://mds3.fidoalliance.org/
Solutions:
- Network Issues:
# Configure proxy if needed
export HTTPS_PROXY=http://proxy.company.com:8080
export HTTP_PROXY=http://proxy.company.com:8080
- Certificate Issues:
# Add custom CA certificates
import ssl
ssl_context = ssl.create_default_context()
ssl_context.load_verify_locations('/path/to/custom/ca-bundle.crt')
Section sources
- mds3.py
- config.py
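To tell a corrupted download apart from a signature or revocation failure, the blob's structure can be inspected before any verification runs. The sketch below assumes the blob follows the JWT layout described in the FIDO MDS3 specification (three base64url segments, an `x5c` header, and `no`, `nextUpdate`, and `entries` in the payload). `inspect_blob_unverified` is a hypothetical helper, and it deliberately does not verify the signature, so its output must not be used for trust decisions.

```python
import base64
import json


def _b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def inspect_blob_unverified(blob: str) -> dict:
    """Decode the header and payload of a JWT-formatted blob WITHOUT verifying it."""
    parts = blob.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected 3 dot-separated segments, got {len(parts)}")
    header = json.loads(_b64url_decode(parts[0]))
    payload = json.loads(_b64url_decode(parts[1]))
    return {
        "alg": header.get("alg"),
        "x5c_count": len(header.get("x5c", [])),
        "no": payload.get("no"),
        "nextUpdate": payload.get("nextUpdate"),
        "entry_count": len(payload.get("entries", [])),
    }
```

A `ValueError` or JSON decoding error here suggests a truncated or proxied response (for example an HTML error page), which points back to the network checks rather than to certificate validation.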
Problem: X.509 certificate validation errors in attestation chains.
Validation Process:
flowchart TD
Start([Attestation Received]) --> ParseCert["Parse Certificate Chain"]
ParseCert --> CheckFormat{"Valid Format?"}
CheckFormat --> |No| FormatError["Certificate Format Error"]
CheckFormat --> |Yes| VerifyChain["Verify Certificate Chain"]
VerifyChain --> CheckDates{"Certificates Valid?"}
CheckDates --> |No| DateError["Certificate Expired"]
CheckDates --> |Yes| CheckRoot{"Trusted Root?"}
CheckRoot --> |No| TrustError["Untrusted Root"]
CheckRoot --> |Yes| CheckRevocation["Check Revocation"]
CheckRevocation --> Revoked{"Revoked?"}
Revoked --> |Yes| RevocationError["Device Revoked"]
Revoked --> |No| Success["Validation Successful"]
FormatError --> End([Validation Failed])
DateError --> End
TrustError --> End
RevocationError --> End
Success --> End
Diagram sources
- attestation.py
Common Certificate Issues:
- Expired certificates
- Untrusted root certificates
- Certificate chain validation failures
- Revoked device certificates
Diagnostic Tools:
# Certificate validation debugging
from cryptography.x509 import load_pem_x509_certificate
from cryptography.hazmat.backends import default_backend
def debug_certificate(cert_pem):
    cert = load_pem_x509_certificate(cert_pem.encode(), default_backend())
    print(f"Issuer: {cert.issuer}")
    print(f"Subject: {cert.subject}")
    print(f"Not Valid Before: {cert.not_valid_before}")
    print(f"Not Valid After: {cert.not_valid_after}")
    print(f"Serial Number: {cert.serial_number}")
Section sources
- attestation.py
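The order of checks in the validation flowchart can be sketched as a pure function, which makes it easier to see which failure is reported first when several apply. This is a simplified model, not the logic in `attestation.py`: `CertInfo` is a hypothetical, already-parsed view of a certificate, the "Certificate Chain Error" outcome is added here for the chain step, and no signatures are verified.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional, Set


@dataclass
class CertInfo:
    """Hypothetical, already-parsed view of one certificate in the chain."""
    subject: str
    issuer: str
    not_before: datetime
    not_after: datetime
    serial: int


def classify_chain(chain: List[CertInfo], trusted_roots: Set[str],
                   revoked_serials: Set[int],
                   now: Optional[datetime] = None) -> str:
    """Apply the checks in flowchart order and return the first failure."""
    now = now or datetime.now(timezone.utc)
    if not chain:
        return "Certificate Format Error"
    # Each certificate must name the next one in the chain as its issuer.
    for child, parent in zip(chain, chain[1:]):
        if child.issuer != parent.subject:
            return "Certificate Chain Error"
    if any(not (c.not_before <= now <= c.not_after) for c in chain):
        return "Certificate Expired"
    if chain[-1].issuer not in trusted_roots:
        return "Untrusted Root"
    if any(c.serial in revoked_serials for c in chain):
        return "Device Revoked"
    return "Validation Successful"
```

Because the date check runs before the trust check, an expired certificate from an untrusted root is reported as expired; fixing the clock or the certificate may then surface the trust error.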
Problem: Credential storage failures in local filesystem.
Common Issues:
- Disk space exhaustion
- Permission denied errors
- File corruption
- Concurrent access conflicts
Diagnostic Commands:
# Check disk space
df -h server/server/session-credentials/
# Verify permissions
ls -la server/server/session-credentials/
# Test file operations
python -c "
import os
import tempfile
temp_dir = tempfile.mkdtemp()
test_file = os.path.join(temp_dir, 'test')
with open(test_file, 'w') as f:
    f.write('test')
os.remove(test_file)
os.rmdir(temp_dir)
print('File operations successful')
"
Solutions:
- Disk Space Management:
# Clean up old credentials
find server/server/session-credentials/ -name "*.pkl" -mtime +30 -delete
# Monitor disk usage
du -sh server/server/session-credentials/
- Permission Fixes:
# Fix ownership
sudo chown -R www-data:www-data server/server/session-credentials/
# Fix permissions
sudo chmod -R 755 server/server/session-credentials/
Section sources
- storage.py
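File corruption and concurrent access conflicts often come from a process being interrupted, or read, halfway through a write. One common mitigation is to write to a temporary file and rename it into place. The sketch below shows that pattern; `atomic_write_bytes` is a hypothetical helper, and how `storage.py` actually writes credential files is not known here.

```python
import os
import tempfile


def atomic_write_bytes(path: str, data: bytes) -> None:
    """Write data to path so readers never observe a partially written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the target directory: os.replace is only
    # atomic when source and destination are on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the original file untouched and clean up the temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

This protects readers from torn files, but it does not serialize writers: if two processes update the same credential, the last rename wins.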
Problem: Google Cloud Storage connectivity and authentication issues.
Common Issues:
- Service account authentication failures
- Bucket access permission errors
- Network connectivity problems
- Rate limiting and quota exceeded
Authentication Diagnostics:
# Test GCS authentication
gcloud auth list
gcloud config get-value project
# Test bucket access
gsutil ls gs://your-bucket-name/
Configuration Validation:
# Cloud storage configuration test
from server.server.cloud_storage import _build_client, _ensure_bucket
try:
    client = _build_client()
    bucket = _ensure_bucket()
    print(f"Successfully connected to bucket: {bucket.name}")
except Exception as e:
    print(f"Cloud storage configuration error: {e}")
Solutions:
- Service Account Setup:
# Download service account key
gcloud iam service-accounts keys create key.json \
--iam-account=your-service-account@your-project.iam.gserviceaccount.com
# Set environment variables
export GOOGLE_APPLICATION_CREDENTIALS=$(pwd)/key.json
export FIDO_SERVER_GCS_BUCKET=your-bucket-name
- Network Configuration:
# Allow outbound HTTPS so that tagged instances can reach Cloud Storage
gcloud compute firewall-rules create allow-gcs-access \
--direction EGRESS \
--action ALLOW \
--rules tcp:443 \
--destination-ranges 0.0.0.0/0 \
--target-tags gcs-access
Section sources
- cloud_storage.py
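Many service account authentication failures come down to pointing the credentials variable at the wrong kind of file. The sketch below checks a key file offline before any network call is made. `service_account_key_problems` is a hypothetical helper, and the field list is a subset of what a Google service account JSON key normally contains.

```python
import json

# Fields a service account JSON key is expected to carry (subset).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def service_account_key_problems(path: str) -> list:
    """Return the problems found in a service account key file."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            key = json.load(handle)
    except OSError as exc:
        return [f"cannot read file: {exc}"]
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") not in (None, "service_account"):
        # For example, user credentials written by gcloud use another type.
        problems.append(f"unexpected type: {key['type']!r}")
    return problems
```

An empty list only means the file is well formed; the key may still be disabled or lack bucket permissions, which the `gsutil ls` test above would reveal.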
Common WebAuthn Errors:
| Error Category | HTTP Status | Description | Resolution |
|---|---|---|---|
| Invalid Request | 400 | Malformed request parameters | Validate input format |
| Unauthorized | 401 | Missing or invalid authentication | Check authentication tokens |
| Forbidden | 403 | Insufficient permissions | Verify user permissions |
| Not Found | 404 | Resource not found | Check resource existence |
| Conflict | 409 | Resource conflict | Resolve conflicting operations |
| Internal Error | 500 | Server-side error | Check server logs |
CTAP2 Command Status Codes:
| Code | Name | Description | Action |
|---|---|---|---|
| 0x00 | SUCCESS | Operation completed successfully | Continue with next step |
| 0x01 | INVALID_COMMAND | Unsupported or invalid command | Check command specification |
| 0x02 | INVALID_PARAMETER | Invalid parameter value | Validate parameter constraints |
| 0x03 | INVALID_LENGTH | Data length exceeds limits | Check data size limits |
| 0x05 | TIMEOUT | Operation timed out | Increase timeout or retry |
| 0x21 | PROCESSING | Device is processing request | Wait for completion |
| 0x2F | USER_ACTION_TIMEOUT | User did not respond in time | Prompt the user to confirm on the device, then retry |
| 0x31 | PIN_INVALID | PIN verification failed | Reset or correct PIN |
| 0x35 | PIN_NOT_SET | PIN not configured | Set PIN before use |
| 0x7F | OTHER | Other unspecified error | Check device logs |
Error Code Translation:
def translate_ctap_error(code):
    error_map = {
        0x00: "Success",
        0x01: "Invalid Command",
        0x02: "Invalid Parameter",
        0x03: "Invalid Length",
        0x05: "Timeout",
        0x21: "Processing",
        0x2F: "User Action Timeout",
        0x31: "PIN Invalid",
        0x35: "PIN Not Set",
        0x7F: "Other Error"
    }
    return error_map.get(code, f"Unknown Error (0x{code:02X})")
Section sources
- ctap.py
Log Configuration: The platform uses Flask's built-in logging with configurable levels.
Log Locations:
- Application logs: Standard output/stderr
- Access logs: Flask development server
- Error logs: Python exception traces
Log Analysis Commands:
# Tail application logs (assumes stdout/stderr is redirected to this file)
tail -f /var/log/webauthn-server.log
# Filter error logs
grep -i error /var/log/webauthn-server.log
# Search for specific issues
grep -i "device.*not.*found" /var/log/webauthn-server.log
grep -i "metadata.*error" /var/log/webauthn-server.log
Structured Logging Example:
# Enhanced logging with context
import logging
from flask import request
logger = logging.getLogger(__name__)
@app.before_request
def log_request_info():
    logger.debug('Headers: %s', request.headers)
    logger.debug('Body: %s', request.get_data())
@app.errorhandler(Exception)
def handle_exception(e):
    logger.error('Unhandled exception: %s', str(e), exc_info=True)
    return {'error': 'Internal server error'}, 500
Section sources
- app.py
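Request headers can carry session cookies and authorization tokens, so logging them verbatim is best limited to short debugging sessions. The sketch below masks such values before they reach the log. `redact_headers` is a hypothetical helper, and the header list is illustrative rather than exhaustive.

```python
# Header names whose values should never be written to logs (illustrative list).
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie", "x-api-key"}


def redact_headers(headers) -> dict:
    """Return a copy of the headers with sensitive values masked."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in dict(headers).items()
    }


if __name__ == "__main__":
    print(redact_headers({"Cookie": "session=abc123", "Accept": "application/json"}))
```

In a `before_request` hook like the one in the structured logging example, this would be used as `logger.debug('Headers: %s', redact_headers(request.headers))`.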
HID Layer Debugging:
# Enable HID debugging
import logging
logging.getLogger('fido2.hid').setLevel(logging.DEBUG)
# Trace device communication
from fido2.hid import CtapHidDevice
def debug_device_communication():
    devices = list(CtapHidDevice.list_devices())
    for dev in devices:
        print(f"Device: {dev.descriptor}")
        try:
            # Send ping command
            dev.call(0x01, b'\x00' * 8)
            print("Ping successful")
        except Exception as e:
            print(f"Ping failed: {e}")
Packet Capture:
# Capture USB traffic (requires appropriate drivers)
usbmon -t
# Monitor HID events
hidlisten -v
Section sources
- base.py
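Raw bytes from a capture or from `hexdump` are easier to read once the CTAPHID framing is decoded. The sketch below parses a single packet according to the framing in the CTAP specification: a 4-byte channel ID, then either a command byte with the high bit set plus a 16-bit big-endian payload length (initialization packet) or a sequence number (continuation packet). `parse_ctaphid_packet` is a hypothetical helper and assumes the report ID byte, if any, has already been stripped.

```python
import struct

# CTAPHID command identifiers (without the high bit that marks an init packet).
CTAPHID_COMMANDS = {0x01: "PING", 0x03: "MSG", 0x04: "LOCK", 0x06: "INIT",
                    0x08: "WINK", 0x10: "CBOR", 0x11: "CANCEL",
                    0x3B: "KEEPALIVE", 0x3F: "ERROR"}


def parse_ctaphid_packet(packet: bytes) -> dict:
    """Decode one CTAPHID packet into its framing fields."""
    if len(packet) < 5:
        raise ValueError("packet too short")
    cid, first = struct.unpack(">IB", packet[:5])
    if first & 0x80:
        # Initialization packet: command byte, then big-endian payload length.
        if len(packet) < 7:
            raise ValueError("initialization packet too short")
        (length,) = struct.unpack(">H", packet[5:7])
        cmd = first & 0x7F
        return {"type": "init", "cid": f"{cid:08x}",
                "command": CTAPHID_COMMANDS.get(cmd, hex(cmd)),
                "length": length, "data": packet[7:7 + length]}
    # Continuation packet: the byte after the channel ID is a sequence number.
    return {"type": "cont", "cid": f"{cid:08x}", "seq": first, "data": packet[5:]}
```

For example, a packet starting with `ff ff ff ff 86` is an INIT request on the broadcast channel, which is the first thing a host sends to a newly attached authenticator.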
Metadata Validation Tools:
# Metadata service health check
from server.server.metadata import ensure_metadata_bootstrapped
def check_metadata_health():
    try:
        ensure_metadata_bootstrapped()
        print("Metadata service healthy")
    except Exception as e:
        print(f"Metadata service error: {e}")
# Certificate chain validation
from server.server.attestation import verify_attestation_chain
def validate_attestation_chain(attestation_data):
    try:
        result = verify_attestation_chain(attestation_data)
        if result['root_valid']:
            print("Certificate chain valid")
        else:
            print(f"Certificate chain invalid: {result['errors']}")
    except Exception as e:
        print(f"Chain validation error: {e}")
Section sources
- mds3.py
Problem: Slow server startup or dependency loading.
Startup Process:
flowchart TD
Start([Server Start]) --> LoadConfig["Load Configuration"]
LoadConfig --> WarmDeps["Warm Dependencies"]
WarmDeps --> LoadMetadata["Load Metadata"]
LoadMetadata --> LoadStorage["Initialize Storage"]
LoadStorage --> LoadDevices["Detect Devices"]
LoadDevices --> Ready["Server Ready"]
WarmDeps --> MetadataCheck{"Metadata Available?"}
MetadataCheck --> |No| RetryMetadata["Retry Metadata"]
MetadataCheck --> |Yes| StorageCheck{"Storage Ready?"}
RetryMetadata --> MetadataCheck
StorageCheck --> |No| RetryStorage["Retry Storage"]
StorageCheck --> |Yes| DeviceCheck{"Devices Found?"}
RetryStorage --> StorageCheck
DeviceCheck --> |No| RetryDevices["Retry Devices"]
DeviceCheck --> |Yes| Ready
RetryDevices --> DeviceCheck
Diagram sources
- startup.py
Performance Optimization:
# Startup performance monitoring
import time
from server.server.startup import warm_up_dependencies
def monitor_startup_performance():
    start_time = time.time()
    try:
        warm_up_dependencies()
        elapsed = time.time() - start_time
        print(f"Startup completed in {elapsed:.2f} seconds")
    except Exception as e:
        elapsed = time.time() - start_time
        print(f"Startup failed after {elapsed:.2f} seconds: {e}")
# Parallel dependency loading
# (ensure_metadata_bootstrapped, cloud_storage, session_metadata_store and logger
# are assumed to be imported or defined elsewhere in the application)
import concurrent.futures
def parallel_warmup():
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [
            executor.submit(ensure_metadata_bootstrapped),
            executor.submit(cloud_storage.ensure_ready),
            executor.submit(session_metadata_store.ensure_session, "__startup__"),
        ]
        for future in concurrent.futures.as_completed(futures):
            try:
                future.result()
            except Exception as e:
                logger.warning(f"Parallel warmup failed: {e}")
Section sources
- startup.py
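Timing the whole warm-up shows that startup is slow but not which stage is responsible. The sketch below times each stage separately and keeps going after a failure so that every stage gets a measurement. `time_steps` is a hypothetical helper; the stage callables would be the platform's own functions, such as those named in the parallel warm-up example.

```python
import time
from typing import Callable, Dict, List, Tuple


def time_steps(steps: List[Tuple[str, Callable[[], object]]]) -> Dict[str, dict]:
    """Run each named step in order and record its duration and any error."""
    results = {}
    for name, step in steps:
        started = time.perf_counter()
        error = None
        try:
            step()
        except Exception as exc:  # keep going so every step gets timed
            error = repr(exc)
        results[name] = {"seconds": time.perf_counter() - started, "error": error}
    return results


if __name__ == "__main__":
    # Stand-in steps; replace with the real warm-up callables.
    report = time_steps([("config", lambda: None),
                         ("metadata", lambda: time.sleep(0.1))])
    for name, info in report.items():
        print(f"{name}: {info['seconds']:.3f}s error={info['error']}")
```

Running the stages sequentially first gives a baseline; if one stage dominates, parallel loading will not reduce startup time below that stage's duration.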
Resource Monitoring:
# Monitor memory usage
ps aux | grep python
top -p $(pgrep -d, -f "python.*webauthn")
# Check disk usage
du -sh server/server/session-credentials/
du -sh server/server/static/
# Monitor network connections
netstat -tulpn | grep :5000
lsof -i :5000
Memory Optimization:
# Memory profiling
import psutil
import gc
def monitor_memory_usage():
    process = psutil.Process()
    mem_info = process.memory_info()
    print(f"Memory usage: {mem_info.rss / 1024 / 1024:.2f} MB")
    # Force garbage collection
    gc.collect()
    mem_info = process.memory_info()
    print(f"After GC: {mem_info.rss / 1024 / 1024:.2f} MB")
Primary Resources:
- FIDO Alliance Specifications: https://fidoalliance.org/specifications/
- WebAuthn Explained: https://webauthn.io/
- Post-Quantum Cryptography: https://csrc.nist.gov/projects/post-quantum-cryptography
Platform-Specific Documentation:
- liboqs Documentation: https://liboqs.org/
- Google Cloud Storage: https://cloud.google.com/storage/docs
- Flask Framework: https://flask.palletsprojects.com/
Getting Help:
- GitHub Issues: Report bugs and feature requests
- Stack Overflow: Tag with "post-quantum-webauthn"
- Discord Channels: Join community discussions
- Mailing Lists: Subscribe to development updates
Bug Reporting Guidelines:
## Bug Report Template
**Environment**:
- OS: [e.g., Ubuntu 22.04]
- Python Version: [e.g., 3.9.7]
- liboqs Version: [e.g., 0.14.1]
- Browser: [e.g., Chrome 115]
**Steps to Reproduce**:
1. [First step]
2. [Second step]
3. [Third step]
**Expected Behavior**:
[Description of expected behavior]
**Actual Behavior**:
[Description of actual behavior]
**Logs**:
[Paste relevant log output here]
**Additional Context**:
[Any additional information that might help diagnose the issue]
Issue Severity Levels:
| Severity | Description | Response Time | Escalation Path |
|---|---|---|---|
| Critical | System down, data loss | 1 hour | Immediate team contact |
| High | Major functionality broken | 4 hours | Team lead notification |
| Medium | Feature degradation | 24 hours | Regular support channels |
| Low | Minor issues, enhancements | 1 week | Community forum |
Escalation Procedures:
- Self-Diagnosis: Attempt to resolve using documentation
- Community Support: Seek help from community channels
- Official Support: Contact vendor support for critical issues
- Security Issues: Report immediately to security team
Contact Information:
- Security Issues: security@domain.com
- General Support: support@domain.com
- Sales Inquiries: sales@domain.com
Development Setup:
# Clone repository
git clone https://github.com/rainzhang05/postquantum-webauthn-platform.git
cd postquantum-webauthn-platform
# Set up virtual environment
python -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
pip install -e .
# Run tests
pytest tests/
Development Guidelines:
- Follow PEP 8 coding standards
- Write comprehensive tests
- Update documentation
- Submit pull requests with clear descriptions
Section sources
- requirements.txt
- test_storage.py