I'm working with a distributed JanusGraph architecture deployed on Azure Kubernetes Service (AKS):
Infrastructure:
AKS Cluster: 2 nodes (16 vCPU, 64 GB RAM each)
Cassandra: 2 replicas with sharding enabled (Kubernetes pods)
Elasticsearch: 2 replicas with sharding enabled (Kubernetes pods)
JanusGraph: Single replica connected to both backends (Kubernetes pod)
Mixed index: Created on title and nt property keys
Connection Pool Implementation:
I've implemented thread-safe connection pooling in which each keyspace gets its own cached traversal:
import threading

from gremlin_python.driver import serializer
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.process.traversal import Order


class BaseGremlinClass(View):
    # Shared across all instances: one connection/traversal per keyspace
    _connections = {}
    _lock = threading.Lock()

    def get_traversal(self, keyspace_name):
        """Get or create a traversal for the given keyspace"""
        if keyspace_name not in settings.JANUSGRAPH_KEYSPACES:
            raise ValueError(f"Keyspace {keyspace_name} not found in settings")
        with self._lock:
            if keyspace_name not in self._connections:
                self._create_connection(keyspace_name)
            print("Getting Connection from Pool")
            return self._connections[keyspace_name]['traversal']

    def _create_connection(self, keyspace_name):
        """Create a new connection and traversal"""
        try:
            config = settings.JANUSGRAPH_KEYSPACES[keyspace_name]
            connection = DriverRemoteConnection(
                config['url'],
                config['graph'],
                message_serializer=serializer.GraphSONSerializersV3d0(),
                timeout=30,
                pool_size=10,
                max_workers=4,
            )
            traversal_g = traversal().withRemote(connection)
            self._connections[keyspace_name] = {
                'connection': connection,
                'traversal': traversal_g
            }
            logger.info(f"Created connection for keyspace {keyspace_name}")
        except Exception as e:
            logger.error(f"Error creating connection to {keyspace_name}: {e}")
            raise


class GremlinQueries:
    def __init__(self, keyspace_name='main'):
        traversal = BaseGremlinClass()
        # evaluationTimeout of 0 disables the per-request timeout
        self.g = traversal.get_traversal(keyspace_name).with_('evaluationTimeout', 0)
        self.keyspace_name = keyspace_name

    def get_all_nodes_label(self):
        """Return list of node types"""
        data = self.g.V().has('nt').values('nt').dedup().order().by(Order.asc).to_list()
        return data
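The caching behaviour described above can be checked in isolation. The following is a minimal sketch, not the code running in this deployment: `KeyedCache` and `fake_factory` are hypothetical stand-ins for `BaseGremlinClass` and `DriverRemoteConnection`, so it runs without a JanusGraph server. It illustrates that when several threads ask for the same keyspace at once, the lock lets only one of them create the object.

```python
import threading


class KeyedCache:
    """Create one object per key, even when many threads ask at once."""

    def __init__(self, factory):
        self._factory = factory  # callable(key) -> object; stands in for connection creation
        self._items = {}
        self._lock = threading.Lock()

    def get(self, key):
        # The check and the creation happen under the same lock,
        # so the factory runs at most once per key.
        with self._lock:
            if key not in self._items:
                self._items[key] = self._factory(key)
            return self._items[key]


created = []  # records every factory call


def fake_factory(key):
    # Hypothetical replacement for opening a real connection
    created.append(key)
    return object()


cache = KeyedCache(fake_factory)
results = []


def worker():
    results.append(cache.get("main"))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(created))                   # number of factory calls
print(len(set(map(id, results))))     # number of distinct objects handed out
```

Both printed counts should be 1 if the cache behaves as intended: one creation, and the same object returned to all eight threads.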
Performance Issue
Despite having connection pooling, indexing, and sharding implemented, I'm observing:
First query execution: Takes significantly longer (e.g., 45 seconds)
Second query: Runs in almost half the time (~20-25 seconds)
Subsequent queries: Stay at the level of the second run (~20-25 seconds)
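For reference, timings like the ones above can be collected with a small harness. This is only a sketch under assumptions: `time_runs`, `cold_to_warm_ratio`, and `fake_query` are hypothetical names introduced here, and `fake_query` stands in for the real call so the example runs without a server.

```python
import time


def time_runs(query, runs=3):
    """Call `query` `runs` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        query()
        durations.append(time.perf_counter() - start)
    return durations


def cold_to_warm_ratio(durations):
    """Ratio of the first run to the mean of all later runs."""
    warm = durations[1:]
    return durations[0] / (sum(warm) / len(warm))


calls = []


def fake_query():
    # Hypothetical stand-in for GremlinQueries().get_all_nodes_label
    calls.append(1)


durations = time_runs(fake_query, runs=3)
print(len(durations))

# Illustrative figures in the range reported above (45 s cold, 20-25 s warm)
ratio = cold_to_warm_ratio([45.0, 22.0, 23.0])
print(ratio)
```

With the real client, `fake_query` would be replaced by something like `GremlinQueries().get_all_nodes_label`, and the ratio gives a single number for comparing cold-start behaviour across configuration changes.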
Questions
Why does the first Gremlin query take significantly longer than subsequent runs in a Kubernetes environment, even with connection pooling and indexing?
What Kubernetes-specific factors might be contributing to this cold start behavior?
What optimisations can be implemented to reduce the first-time latency in a containerised distributed setup?
Are there specific considerations for sharded Cassandra/Elasticsearch deployments on Kubernetes that could impact initial query performance?
What I've Tried
Verified that connection pooling is working (connections are reused)
Confirmed mixed indexes are properly created and being used
Checked that subsequent queries with same/different parameters show improved performance
Monitored that the connection pool prevents reconnection overhead