Commit 9530323

Added changes to fix pod crash after deployment
1 parent 290491f commit 9530323

5 files changed, +249 -2 lines changed


.github/workflows/deploy-azure.yml

Lines changed: 63 additions & 2 deletions
@@ -107,17 +107,78 @@ jobs:
             done
           fi
 
+          # Hash the credentials so the pod template changes, and the pod restarts, whenever they change
+          # Use mktemp so the temporary file gets an unpredictable name
+          TEMP_FILE=$(mktemp)
+
+          # Write both values with the printf builtin, newline-separated, so they stay out of process arguments
+          {
+            printf "%s\n" "${{ secrets.MEMGRAPH_USERNAME }}"
+            printf "%s" "${{ secrets.MEMGRAPH_PASSWORD }}"
+          } > "$TEMP_FILE"
+
+          # Generate the hash and immediately remove the file
+          CREDENTIALS_HASH=$(sha256sum "$TEMP_FILE" | awk '{print $1}')
+          rm -f "$TEMP_FILE"
+
+          # Store the hash in an environment variable for later steps
+          echo "CREDENTIALS_HASH=$CREDENTIALS_HASH" >> "$GITHUB_ENV"
+
+          # Apply the Kubernetes secret with the new credentials
           kubectl create secret generic memgraph-credentials \
             --from-literal=username=${{ secrets.MEMGRAPH_USERNAME }} \
             --from-literal=password=${{ secrets.MEMGRAPH_PASSWORD }} \
             --dry-run=client -o yaml | kubectl apply -f -
 
       - name: Deploy to AKS
         run: |
-          kubectl apply -f infra/k8s/memgraph.yaml
+          # Replace only the ${CREDENTIALS_HASH} placeholder with the actual credentials hash
+          envsubst '${CREDENTIALS_HASH}' < infra/k8s/memgraph.yaml > memgraph_deploy.yaml
+
+          # Apply the updated deployment manifest
+          kubectl apply -f memgraph_deploy.yaml
+
+          # Force a restart if a Memgraph pod already exists
+          POD_NAME=$(kubectl get pods -l app=memgraph -o jsonpath="{.items[0].metadata.name}" 2>/dev/null || echo "")
+          if [[ -n "$POD_NAME" ]]; then
+            echo "Forcing restart of existing Memgraph pod..."
+            kubectl delete pod "$POD_NAME"
+          fi
+
+          # Remove the rendered manifest
+          rm memgraph_deploy.yaml
 
       - name: Verify Deployment
         run: |
+          echo "Checking deployment status..."
           kubectl get pods
           kubectl get services
-          kubectl wait --for=condition=ready pod -l app=memgraph --timeout=5m
+
+          echo "Waiting for Memgraph pod to be ready (may take up to 5 minutes)..."
+          if ! kubectl wait --for=condition=ready pod -l app=memgraph --timeout=5m; then
+            echo "Error: Memgraph pod did not become ready within the timeout period."
+            echo "Checking Memgraph pod logs:"
+            POD_NAME=$(kubectl get pods -l app=memgraph -o jsonpath="{.items[0].metadata.name}")
+            kubectl logs "$POD_NAME"
+            kubectl describe pod "$POD_NAME"
+            echo "Deployment verification failed!"
+            exit 1
+          fi
+
+          echo "Memgraph deployment successful!"
+
+          # Wait for the LoadBalancer service to get an external IP
+          echo "Waiting for LoadBalancer to get external IP..."
+          for i in {1..30}; do
+            EXTERNAL_IP=$(kubectl get service memgraph -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
+            if [[ -n "$EXTERNAL_IP" ]]; then
+              echo "Memgraph is accessible at: ${EXTERNAL_IP}:7687"
+              break
+            fi
+            echo "Waiting for external IP (attempt $i)..."
+            sleep 10
+          done
+
+          if [[ -z "$EXTERNAL_IP" ]]; then
+            echo "Warning: Could not obtain external IP for Memgraph service within timeout."
+          fi
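
The credentials-hash step above can be sketched in Python to show why the two fields should be kept apart before hashing. This is a minimal illustration, not what the workflow runs: the function name `credentials_hash` and the newline separator are assumptions made for this sketch.

```python
import hashlib


def credentials_hash(username: str, password: str) -> str:
    """Return a SHA-256 hex digest over both fields, newline-separated.

    Hypothetical helper for illustration only. The separator keeps
    ("ab", "c") and ("a", "bc") from producing the same digest.
    """
    payload = f"{username}\n{password}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    # Placeholder values; real credentials would come from the secret store.
    print(credentials_hash("memgraph", "example-password"))
```

Without a separator, moving a character from the end of the username to the start of the password would leave the digest unchanged, so the pod annotation would not change either.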

README.md

Lines changed: 2 additions & 0 deletions
@@ -34,6 +34,8 @@ To deploy to Azure:
 3. Add `MEMGRAPH_USERNAME` and `MEMGRAPH_PASSWORD` to GitHub secrets
 4. Push to main branch to trigger deployment or manually trigger the workflow
 
+If you encounter issues connecting to Memgraph after deployment, see [Memgraph Troubleshooting Guide](docs/memgraph-troubleshooting.md).
+
 ## Architecture
 The project follows a component-based architecture where the AI Agent orchestrates interactions between users, language models, local tools, and MCP servers.
 

docs/memgraph-troubleshooting.md

Lines changed: 111 additions & 0 deletions
@@ -0,0 +1,111 @@
+# Troubleshooting Memgraph Connections
+
+This guide provides troubleshooting steps for common issues when connecting to Memgraph.
+
+## Connection Issues
+
+If you're experiencing issues connecting to the Memgraph database deployed on Azure, follow these steps:
+
+### 1. Check if the Memgraph service is running
+
+```bash
+kubectl get pods -l app=memgraph
+```
+
+The output should show a pod in the `Running` state. If not, check the pod status:
+
+```bash
+kubectl describe pod -l app=memgraph
+```
+
+### 2. Check the Memgraph logs for errors
+
+```bash
+kubectl logs -l app=memgraph
+```
+
+Common errors include:
+- VM memory map count issues
+- Authentication failures
+- Networking problems
+
+### 3. Verify the credential secret exists
+
+```bash
+kubectl get secret memgraph-credentials
+```
+
+### 4. Run the diagnostic script
+
+The diagnostic script checks for common issues and attempts to fix them:
+
+```bash
+./scripts/memgraph_diagnostics.sh
+```
+
+This script will:
+- Check pod status
+- Verify service configuration
+- Validate credentials
+- Look for common errors in logs
+- Apply fixes when possible
+- Test connectivity
+
+### 5. Manual fix for credential issues
+
+If you've changed the Memgraph credentials in GitHub secrets but the pod doesn't reflect the changes:
+
+1. Update the Kubernetes secret:
+   ```bash
+   kubectl create secret generic memgraph-credentials \
+     --from-literal=username=YOUR_USERNAME \
+     --from-literal=password=YOUR_PASSWORD \
+     --dry-run=client -o yaml | kubectl apply -f -
+   ```
+
+2. Force the pod to restart with the new credentials:
+   ```bash
+   kubectl patch deployment memgraph -p \
+     "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"restart-at\":\"$(date +%s)\"}}}}}"
+   ```
+
+3. Wait for the new pod to be ready:
+   ```bash
+   kubectl wait --for=condition=ready pod -l app=memgraph --timeout=5m
+   ```
+
+### 6. Testing connectivity
+
+You can test connectivity using the provided test script:
+
+```bash
+# Make sure your .env file contains the correct credentials
+python examples/10-azure-memgraph-test.py
+```
+
+## Common Errors and Solutions
+
+### VM Max Map Count Error
+
+If you see an error like "Max virtual memory areas vm.max_map_count 65530 is too low", the latest deployment configuration fixes it with an init container that sets `vm.max_map_count=262144`.
+
+### Authentication Errors
+
+If you're getting authentication errors:
+1. Verify the credentials in your `.env` file match those in the Kubernetes secret
+2. Ensure the GitHub secrets have been properly updated
+3. Check if the deployment was updated after changing the credentials
+
+### Connection Timeouts
+
+If connections are timing out:
+1. Verify the Azure Network Security Group allows traffic on port 7687
+2. Check if the service has a valid external IP address
+3. Ensure your client can reach the service's external IP (no firewall blocking access)
+
+## Need Further Help?
+
+If the above steps don't resolve your issue, please:
+1. Gather the output from the diagnostic script
+2. Collect all relevant error messages
+3. Open an issue providing these details
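
The hand-escaped JSON in the `kubectl patch` command from step 5 is easy to mistype. As a sketch, the same payload can be built with Python's `json` module; the helper name `build_restart_patch` is hypothetical, and the `restart-at` annotation key is taken from the guide above.

```python
import json
import time


def build_restart_patch(timestamp: int) -> str:
    """Build the strategic-merge patch body used to force a rollout.

    Hypothetical helper: changing any pod-template annotation makes the
    Deployment controller replace the pod.
    """
    patch = {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {"restart-at": str(timestamp)}
                }
            }
        }
    }
    return json.dumps(patch)


if __name__ == "__main__":
    # The printed string is what would be passed to `kubectl patch ... -p`.
    print(build_restart_patch(int(time.time())))
```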

examples/10-azure-memgraph-test.py

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+from dotenv import load_dotenv
+load_dotenv(override=True)
+
+import os
+import mgclient
+import sys
+import time
+
+def test_memgraph_connection(max_retries=3, retry_delay=5) -> bool:
+    host = os.environ.get("MEMGRAPH_URI", "127.0.0.1")
+    port = int(os.environ.get("MEMGRAPH_PORT", "7687"))
+    username = os.environ.get("MEMGRAPH_USERNAME", "memgraph")
+    password = os.environ.get("MEMGRAPH_PASSWORD", "memgraph")
+
+    print(f"Testing connection to Memgraph at {host}:{port} with username: {username}")
+
+    # Convert localhost to IP if needed
+    if host == "localhost":
+        host = "127.0.0.1"
+
+    for attempt in range(1, max_retries + 1):
+        try:
+            print(f"Attempt {attempt}/{max_retries}: Connecting to Memgraph at {host}:{port}")
+            conn = mgclient.connect(
+                host=host,
+                port=port,
+                username=username,
+                password=password,
+                connect_timeout_ms=10000  # 10 second timeout
+            )
+
+            conn.autocommit = True
+            cursor = conn.cursor()
+
+            # Test a basic query
+            cursor.execute("RETURN 'Connection successful!' AS message")
+            result = cursor.fetchone()[0]
+            print(f"Success: {result}")
+
+            # Create and then delete a test node to verify that writes work
+            print("Testing data manipulation...")
+            cursor.execute("CREATE (n:TestNode {name: 'connection_test'}) RETURN n")
+            cursor.execute("MATCH (n:TestNode {name: 'connection_test'}) DELETE n")
+            print("Data manipulation successful")
+
+            cursor.close()
+            conn.close()
+            return True
+        except Exception as e:
+            print(f"Connection attempt {attempt} failed: {e}")
+            if attempt < max_retries:
+                print(f"Retrying in {retry_delay} seconds...")
+                time.sleep(retry_delay)
+            else:
+                print("All connection attempts failed")
+    return False
+
+if __name__ == "__main__":
+    success = test_memgraph_connection()
+    if not success:
+        print("Failed to connect to Memgraph")
+        sys.exit(1)
+    print("Successfully connected to Memgraph")
+    sys.exit(0)
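
The retry loop in the test script can be factored into a small reusable helper, which also makes it testable without a running database. This is a sketch under assumptions: `retry`, its `sleep` parameter, and the `RuntimeError` raised on exhaustion are illustrative choices, not part of the commit.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry(
    action: Callable[[], T],
    max_retries: int = 3,
    retry_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action` until it succeeds or `max_retries` attempts are used.

    Hypothetical helper. `sleep` is injectable so tests can skip real delays.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, max_retries + 1):
        try:
            return action()
        except Exception as exc:
            last_error = exc
            # Wait only when another attempt will follow.
            if attempt < max_retries:
                sleep(retry_delay)
    raise RuntimeError(f"all {max_retries} attempts failed") from last_error
```

A caller could pass a zero-argument function that opens the connection and runs the test query, and treat the `RuntimeError` as the final failure.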

infra/k8s/memgraph.yaml

Lines changed: 9 additions & 0 deletions
@@ -18,7 +18,16 @@ spec:
     metadata:
       labels:
         app: memgraph
+      annotations:
+        # Set from the workflow's credentials hash; a changed value triggers a pod restart
+        credentials-hash: "${CREDENTIALS_HASH}"
     spec:
+      initContainers:
+      - name: init-sysctl
+        image: busybox:1.28
+        command: ["sysctl", "-w", "vm.max_map_count=262144"]
+        securityContext:
+          privileged: true
       containers:
       - name: memgraph
         image: memgraph/memgraph-mage:latest
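
To illustrate how the `${CREDENTIALS_HASH}` placeholder in this manifest gets filled in, here is a minimal sketch that uses Python's `string.Template` as a stand-in for `envsubst`. The function name `render_manifest` is hypothetical, and the workflow itself uses `envsubst`, not this code.

```python
from string import Template


def render_manifest(text: str, credentials_hash: str) -> str:
    """Substitute only ${CREDENTIALS_HASH}, leaving other placeholders alone.

    Hypothetical helper: safe_substitute leaves unknown ${NAME} references
    untouched instead of raising or blanking them.
    """
    return Template(text).safe_substitute(CREDENTIALS_HASH=credentials_hash)


if __name__ == "__main__":
    manifest = 'credentials-hash: "${CREDENTIALS_HASH}"\nother: "${UNRELATED}"'
    print(render_manifest(manifest, "abc123"))
```

Limiting the substitution to one named variable avoids accidentally replacing other `$`-style references that might appear elsewhere in a manifest.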
