talks

A collection of my technical talks at conferences and meetups, with slide decks and recordings.

#	Date	Event	Title	Slide Deck	Recording	Description
1	2024-07-06	Bengaluru Streams Meetup	Batch to Near-Realtime: Inspired by a Real Production Incident	View Slides	Watch Recording	Insights into transitioning from batch to real-time processing.
2	2024-02-03	MyDBOps Open Source Database Meetup	Navigating Transactions: ACID Complexity in Modern Databases	View Slides	Watch Recording	Understanding ACID properties in contemporary databases.
3	2023-12-06	Druid Summit 2023	Changing Druid Ingestion from 3 Hours to 5 Minutes	View Slides	Watch Recording	Optimizing Druid ingestion processes.
4		Pulsar Summit Asia 2022	Streaming Wars: How Apache Pulsar is Acing the Battle	View Slides	Watch Recording	Exploring Apache Pulsar's role in the streaming ecosystem.
5		Pulsar Summit Asia 2021	Designing Pulsar for Isolation	View Slides	Watch Recording	Strategies for isolating workloads in Apache Pulsar.
6		ApacheCon 2021	Structured Data Streaming with Apache Pulsar	View Slides	Watch Recording	Leveraging Apache Pulsar for structured data streaming.
7		ApacheCon 2021	Apache BookKeeper Key-Value Store and Use Cases	View Slides	Watch Recording	Insights into Apache BookKeeper's key-value store capabilities.
8		Pulsar NA Summit 2021	How Pulsar Stores Data	View Slides	Watch Recording	Understanding Apache Pulsar's data storage mechanisms.
9		Pulsar Summit Asia	Running a Secure Pulsar Cluster	View Slides	Watch Recording	Best practices for securing Apache Pulsar deployments.
10		Pulsar Summit Asia	Lessons from Managing a Pulsar Cluster	View Slides	Watch Recording	Experiences and lessons learned from managing Apache Pulsar clusters.
11		FOSSASIA 2015	MySQL Group Replication	View Slides	N/A	Deep dive into MySQL's group replication features.
12		Open Source India 2014	MySQL High Availability with Replication New Features	View Slides	N/A	Exploring new features in MySQL replication for high availability.
13		MySQL Developer Day Conference	MySQL Replication and Scalability	View Slides	N/A	Strategies for scaling MySQL using replication techniques.
14		MySQL User Camp	GTIDs in MySQL	View Slides	N/A	Understanding Global Transaction Identifiers in MySQL replication.
15	2023-10-20	Open Source India, 2023	Building Blocks of Open Source Databases	View Slides	Watch Recording	Building Blocks of Open Source Databases
16		1st Apache Druid Meetup, Bangalore	Apache Druid on Kubernetes	View Slides	Watch Recording	Druid on Kubernetes
17		Bengaluru Streams Meetup	Apache Pulsar - The anatomy of Pub & Sub	View Slides	Watch Recording	Apache Pulsar - The anatomy of Pub & Sub
18		Pulsar Summit Asia 2022	Keeping on top of hybrid cloud usage with Pulsar	View Slides	Watch Recording	Keeping on top of hybrid cloud usage with Pulsar
19	2024-11-21	Open Source Analytics Conference	Unified Data Management with ClickHouse® and Postgres	View Slides	Watch Recording	Unified Data Management with ClickHouse® and Postgres
20	2021-10-08	EventSourcing Live 2021	Streaming app changes to Event Store	View Slides	Watch Recording	Streaming Event Changes to App Via Events or CDC, tradeoff and challenges
21	2023-06-14	1st Apache Pulsar India User Meet	Apache Pulsar Design Choices & use-cases	View Slides	Watch Recording	Design Choices to love in Pulsar Architecture and the Trade-Offs
22	2024-05-24	SNIA Webinar	Navigating Transactions: ACID Complexity in Modern Databases	View Slides	Watch Recording	Understanding ACID properties in contemporary databases.
23	2024-03-23	ClickHouse India Meetup	ClickHouse Bangalore meetup: Doors Open & Community syncup	N/A	Watch Recording	ClickHouse community usage stories and questionnaire
24	2024-04-15	ClickHouse India Webinar	Live Q&A Forum with Database & Cloud Experts	N/A	Watch Recording	Panelist in a ClickHouse live webinar hosted by ClickHouse Inc., answering questions left over from #23
25	2025-03-06	PGConf India 2025	Pushing PostgreSQL to the Limits: Tackling Analytics Workloads with Extensions	View Slides	Watch Recording	Run OLAP benchmarks on Postgres, find issues, and ideate on how to fix them. Read Abstract for more details.
26	2025-05-10	Lakehouse Days Bengaluru	Hacking Iceberg on Your Existing Databases	View Slides	Watch Recording	Hacking ClickHouse & Postgres open source code to support the Iceberg table format
27	2025-06-27	ClickHouse Bangalore Meetup	Squeezing Performance: ClickHouse@4GB on K8s	View Slides	N/A	Benchmarking ClickHouse on Low-Memory Kubernetes Environments
28	2025-07-12	ClickHouse Mumbai Meetup	Rebalancing shards in ClickHouse open source	View Slides	N/A	ClickHouse doesn't rebalance shards when a new shard is added. Presented options, open proposals, and how we solved it.
29	2025-08-07	KubeCon + CloudNativeCon India 2025	Bridging Big Data and Machine Learning Ecosystems: A Cloud Native Approach Using Kubeflow	View Slides	Watch Recording	In today's data-driven landscape, bridging the gap between scalable big data systems (e.g., Apache Spark, Iceberg) and machine learning frameworks (e.g., PyTorch) while minimizing data movement and serialization overhead is a critical challenge. Traditional workflows require costly data serialization between storage (e.g., Parquet/Iceberg) and training frameworks, creating bottlenecks leading to inefficient resource utilization in distributed training. This talk explores a cloud-native solution using Kubeflow for end-to-end ML orchestration and Apache Arrow for high-performance data interchange, enabling seamless integration of analytics and ML workflows.
30	2025-11-05	Open Source Analytics Conference 2025	ClickHouse® Chronicles: Real-World War Rooms with Human and AI Agents	View Slides	Watch Recording	We walk you through some of the toughest incidents we’ve faced: what broke, what we thought was wrong, what actually was wrong, and how we got to the root cause. Along the way, we’ll introduce a practical framework for tackling such issues—combining human intuition, AI assistance, and the messy negotiations that often define real-world problem-solving

Blog Posts

Open Source Work — Databases, Messaging, and Observability

I’ve had 70+ changes merged upstream across major open-source systems including MySQL, Apache Pulsar, and ClickHouse. What follows is a curated set of examples that reflect the kinds of problems I’ve worked on and the impact of that work.

MySQL (Oracle) — Replication & Binlog work

Between 2012 and 2013, I worked on MySQL Server at Oracle, primarily in the Replication and Binlog subsystems. I had 50+ changes merged upstream during this period; the items below are a small, representative subset.

Features / Enhancements

Replication observability via Performance Schema (SHOW SLAVE STATUS)
This work moved replication state from ad-hoc text output into structured Performance Schema tables, making it possible to monitor and reason about replication programmatically.
WorkLog: WL#3656
Key commits:
Improved replication control and operational workflows
Focused on making replication management less disruptive, particularly around master changes and GTID-based setups.
WorkLog: WL#6120
Key commits:

Bug fixes and correctness work

Fixed a worker ID mismatch across replication metadata tables
Addressed inconsistencies between SLAVE_WORKER_INFO and REPLICATION_EXECUTE_STATUS_BY_WORKER.
Commit
Stabilized replication tests during InnoDB crash recovery
Fixed sporadic MTR failures that appeared under crash-recovery scenarios.
Commit
Corrected RESET SLAVE ALL behavior
Ensured all connection parameters in MASTER_INFO are properly reset.
Commit
Fixed incorrect thread ID reporting in replication Performance Schema tables
Aligned replication P_S tables with internal PFS thread identifiers.
Commit
Prevented crashes and invalid data in replication worker status tables
Fixed crashes and garbage values in REPLICATION_EXECUTE_STATUS_BY_WORKER.
Commit
Hardened replication Performance Schema queries
Prevented server crashes when querying replication P_S tables without replication configured.
Commit
Improved mysqlbinlog correctness and usability
- Reset byte position counters correctly when switching binlog files
  Commit
- Made PURGE BINARY LOGS behavior more reliable and informative
  Commit
- Fixed incorrect data-type decoding and DECIMAL handling in verbose output
  Commit · Commit
Reduced flakiness in replication test suite
Fixed intermittent failures in rpl_row_until and related tests on PB2.
Commit · Commit
GTID correctness improvements
- Normalized GTID UUID values for consistency
  Commit
- Fixed SQL_SLAVE_SKIP_COUNTER behavior with GTID_MODE=ON
  Commit

Apache Pulsar — Client, Schema, and Connector work

I’ve contributed to Apache Pulsar across the Java and Python clients, schema handling, and the Pulsar–Flink connector. Most of my work focused on smoothing real-world operational edges: configuration correctness, authentication, schema behavior, and making failures easier to reason about.

Selected contributions

Made schema version information available in the Python client
Exposed the writer schema version on messages so applications can reason about schema evolution at runtime instead of relying on external metadata.
PR #8173
Improved schema handling around incompatible schemas
Enabled incompatible schemas to safely co-exist on a topic, unblocking certain migration and multi-producer use cases.
PR #3840
Hardened Avro encoding failure paths
Fixed an edge case where Avro encoding failures could leave cursors in an inconsistent state, leading to hard-to-debug downstream issues.
PR #6695
Simplified TLS configuration in the Java client
Made TLS usage derive automatically from the service URL protocol instead of requiring explicit flags, reducing configuration foot-guns.
PR #4451
Improved authentication handling in the Java client
Cleaned up how authentication is constructed from class names and parameters, making client configuration more consistent and less error-prone.
PR #4381
Fixed and extended authentication support in the Pulsar–Flink connector
Ensured authentication works correctly when building Flink sources and allowed client auth to be configured directly from the connector layer.
PR #4284 · PR #3949
Moved Pulsar–Flink configuration to typed POJOs
Replaced loosely-typed configuration maps with explicit config objects, improving validation, readability, and long-term maintainability.
PR #4232
Improved error reporting for invalid client configuration
Added explicit errors for out-of-range or invalid configuration values, reducing silent misconfigurations.
PR #3950
Clarified and documented schema auto-update behavior
Documented the default schema auto-update strategy (FULL), reducing surprises when teams first adopt schemas in Pulsar.
PR #3842
Filled gaps in CLI and user documentation
Added missing documentation for publish-rate related CLI commands and improved general project documentation.
PR #6890 · PR #3841
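
Two of the client-side ideas above, deriving TLS usage from the service URL scheme and rejecting out-of-range configuration values, can be sketched in a few lines. This is an illustrative Python sketch under stated assumptions, not the Pulsar client's actual code: `ClientConfig`, `tls_enabled_for`, and the `operation_timeout_ms` and `io_threads` fields are hypothetical names chosen for the example. Only the `pulsar://`, `pulsar+ssl://`, `http://`, and `https://` URL schemes come from Pulsar itself.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Schemes accepted in Pulsar service URLs; the "+ssl" and "https" variants imply TLS.
TLS_SCHEMES = {"pulsar+ssl", "https"}
PLAIN_SCHEMES = {"pulsar", "http"}


def tls_enabled_for(service_url: str) -> bool:
    """Decide TLS usage from the URL scheme instead of a separate flag."""
    scheme = urlparse(service_url).scheme.lower()
    if scheme in TLS_SCHEMES:
        return True
    if scheme in PLAIN_SCHEMES:
        return False
    raise ValueError(f"unsupported service URL scheme: {scheme!r}")


@dataclass(frozen=True)
class ClientConfig:
    """Hypothetical typed config object; field names are illustrative only."""

    service_url: str
    operation_timeout_ms: int = 30_000
    io_threads: int = 1

    def __post_init__(self) -> None:
        # Fail loudly on invalid values rather than silently accepting them.
        tls_enabled_for(self.service_url)  # raises on an unknown scheme
        if self.operation_timeout_ms < 0:
            raise ValueError("operation_timeout_ms must be >= 0")
        if self.io_threads < 1:
            raise ValueError("io_threads must be >= 1")

    @property
    def use_tls(self) -> bool:
        return tls_enabled_for(self.service_url)
```

With a typed object like this, a misconfiguration such as `ClientConfig("pulsar://broker:6650", io_threads=0)` fails at construction time, which is the behavior the error-reporting and typed-configuration changes aim for.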

ClickHouse — Observability and performance metrics

Added detailed mark cache eviction metrics (evicted bytes, marks, and files)
For long-running analytical workloads, understanding cache behavior is essential.
This contribution enhanced ClickHouse internals to expose evicted mark cache statistics, making it easier to reason about cache pressure and performance regressions.
Merged Pull Request: ClickHouse/ClickHouse#80799
(Fixes ClickHouse/ClickHouse#60989)
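
To illustrate why eviction counters are useful, here is a toy byte-bounded LRU cache that records how many entries and bytes it has evicted. It is a sketch of the general idea only, not ClickHouse's mark cache implementation; the class name and counter names are hypothetical.

```python
from collections import OrderedDict


class CountingLRUCache:
    """Toy byte-bounded LRU cache that records eviction statistics."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self.evicted_entries = 0
        self.evicted_bytes = 0
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str):
        value = self._items.get(key)
        if value is not None:
            self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: bytes) -> None:
        # Replacing an existing key is not counted as an eviction.
        if key in self._items:
            self.used_bytes -= len(self._items.pop(key))
        self._items[key] = value
        self.used_bytes += len(value)
        # Evict least recently used entries until the cache fits its budget.
        while self.used_bytes > self.max_bytes and self._items:
            _, old = self._items.popitem(last=False)
            self.used_bytes -= len(old)
            self.evicted_entries += 1
            self.evicted_bytes += len(old)
```

In a cache like this, eviction counters that keep climbing while the workload stays the same would suggest that the working set no longer fits, which is the kind of signal exposed eviction metrics make visible.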


shiv4289/shiv-tech-talks

Folders and files

Latest commit

History

Repository files navigation

talks

Blog Posts

Open Source Work — Databases, Messaging, and Observability

MySQL (Oracle) — Replication & Binlog work

Features / Enhancements

Bug fixes and correctness work

Apache Pulsar — Client, Schema, and Connector work

Selected contributions

ClickHouse — Observability and performance metrics

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages