# CSV Data

The CSV (Comma-Separated Values) format is exceptionally convenient for data processing.
It is simple, yet processed efficiently, supported by many analysis and introspection tools,
}
```

## Combine Multiple Files

The key challenge with multiple files: identical event IDs break the relationships between tables.

When we have multiple CSV files from different runs or datasets, each file starts its event numbering from 0:

File 3: evt = [0, 1, 2, 3, 4, ...] ← ID Collision!

**Problem**: Event 0 from File 1 is completely different from Event 0 from File 2, but they have the same ID!
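The collision is easy to reproduce with a minimal sketch (synthetic data; the `evt` column name follows the files above, the `px` column is a hypothetical payload):

```python
import pandas as pd

# Two toy "files" whose event numbering both starts at 0
file1 = pd.DataFrame({"evt": [0, 1, 2], "px": [0.1, 0.2, 0.3]})
file2 = pd.DataFrame({"evt": [0, 1, 2], "px": [0.4, 0.5, 0.6]})

# Naive concatenation keeps each file's own numbering...
naive = pd.concat([file1, file2], ignore_index=True)

# ...so the same evt value now refers to two different events
print(naive["evt"].tolist())   # [0, 1, 2, 0, 1, 2]
print(naive["evt"].is_unique)  # False
```

With duplicate `evt` values, any later join between tables would silently mix rows from different events.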
**Solution**: Global Unique Event IDs

We need to create globally unique event IDs across all files:
```python
import pandas as pd
import glob

def concat_csvs_with_unique_events(files):
    """Load and concatenate CSV files with globally unique event IDs."""
    dfs = []
    offset = 0

    for f in files:
        df = pd.read_csv(f)
        df["evt"] += offset           # shift this file's event IDs past previous files
        offset = df["evt"].max() + 1  # next file continues after the largest ID so far
        dfs.append(df)

    return pd.concat(dfs, ignore_index=True)

# Load both tables with unique event IDs
lambda_df = concat_csvs_with_unique_events(sorted(glob.glob("mcpart_lambda*.csv")))
dis_df = concat_csvs_with_unique_events(sorted(glob.glob("dis_parameters*.csv")))
```
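The offset scheme can be checked end to end with a hedged, self-contained sketch: it writes two tiny CSV files to a temporary directory (filenames and contents are synthetic; only the `evt` column comes from the text above) and applies the same shift-by-offset logic:

```python
import glob
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    # Two toy CSV files, each numbering its events from 0
    pd.DataFrame({"evt": [0, 1, 2]}).to_csv(os.path.join(tmp, "part_a.csv"), index=False)
    pd.DataFrame({"evt": [0, 1]}).to_csv(os.path.join(tmp, "part_b.csv"), index=False)

    dfs, offset = [], 0
    for f in sorted(glob.glob(os.path.join(tmp, "part_*.csv"))):
        df = pd.read_csv(f)
        df["evt"] += offset           # same offset logic as the function above
        offset = df["evt"].max() + 1
        dfs.append(df)

    combined = pd.concat(dfs, ignore_index=True)
    print(combined["evt"].tolist())   # [0, 1, 2, 3, 4]
```

The second file's events land at 3 and 4 instead of colliding with 0 and 1, so every row keeps a globally unique event ID.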

**Result**: Now we have globally unique event IDs: