Commit bf4e449

jedec parse
1 parent b37d47a commit bf4e449

4 files changed: 227 additions, 23 deletions


.gitignore

Lines changed: 3 additions & 1 deletion
@@ -385,4 +385,6 @@ FodyWeavers.xsd
 
 # JetBrains Rider
 .idea/
-*.sln.iml
+*.sln.iml
+
+*.pdf

libnw/jep106.ids

Lines changed: 21 additions & 21 deletions
@@ -1,6 +1,6 @@
 
 # JEP106BM
-# Version: 2025.07.17
+# Version: 2025.06.12
 1
 	1 AMD
 	2 AMI
@@ -70,7 +70,7 @@
 	66 Macronix
 	67 Xerox
 	68 Plus Logic
-	69 SanDisk (Western Digital)
+	69 SanDisk Technologies Inc
 	70 Elan Circuit Tech.
 	71 European Silicon Str.
 	72 Apple Computer
@@ -151,7 +151,7 @@
 	20 Smart Modular
 	21 Hughes Aircraft
 	22 Lanstar Semiconductor
-	23 Marvell (Qlogic)
+	23 Qlogic
 	24 Kingston
 	25 Music Semi
 	26 Ericsson Components
@@ -458,7 +458,7 @@
 	73 Ritek Corp
 	74 empowerTel Networks
 	75 Hypertec
-	76 Marvell (Cavium Networks)
+	76 Cavium Networks
 	77 PLX Technology
 	78 Massana Design
 	79 Intrinsity
@@ -487,7 +487,7 @@
 	102 Layer N Networks
 	103 MtekVision (Atsana)
 	104 Allegro Networks
-	105 Marvell
+	105 Marvell Semiconductors
 	106 Netergy Microelectronic
 	107 NVIDIA
 	108 Internet Machines
@@ -560,7 +560,7 @@
 	48 OCZ
 	49 Emuzed
 	50 LOGIC Devices
-	51 Marvell (Inphi)
+	51 Inphi Corporation
 	52 Quake Technologies
 	53 Vixel
 	54 SolusTek
@@ -750,7 +750,7 @@
 	111 MetaRAM
 	112 Axel Electronics Co Ltd
 	113 Tilera Corporation
-	114 Marvell (Aquantia)
+	114 Aquantia
 	115 Vivace Semiconductor
 	116 Redpine Signals
 	117 Octalica
@@ -971,7 +971,7 @@
 	78 Mustang
 	79 Orca Systems
 	80 Passif Semiconductor
-	81 GigaDevice Semiconductor (Beijing) Inc
+	81 GigaDevice Semiconductor (Beijing)
 	82 Memphis Electronic
 	83 Beckhoff Automation GmbH
 	84 Harmony Semiconductor Corp
@@ -1142,7 +1142,7 @@
 	122 AltoBeam
 	123 Wave Computing
 	124 Beijing TrustNet Technology Co Ltd
-	125 Marvell (Innovium)
+	125 Innovium Inc
 	126 Starsway Technology Limited
 10
 	1 Weltronics Co LTD
@@ -1375,7 +1375,7 @@
 	101 Esperanto Technologies
 	102 JinSheng Electronic (Shenzhen) Co Ltd
 	103 Shenzhen Shi Bolunshuai Technology
-	104 Shanghai Rei Zuan Information Tech
+	104 Shanghai Ruixuan Information Tech
 	105 Fraunhofer IIS
 	106 Kandou Bus SA
 	107 Acer
@@ -1623,7 +1623,7 @@
 	95 Sitrus Technology
 	96 AnHui Conner Storage Co Ltd
 	97 Rochester Electronics
-	98 Wuxi Smart Memories Technologies Co Ltd
+	98 Wuxi Smart Memories Technologies Co
 	99 Star Memory
 	100 Agile Memory Technology Co Ltd
 	101 MEJEC
@@ -1658,9 +1658,9 @@
 	3 Shenzhen Feisrike Technology Co Ltd
 	4 Shenzhen Sunhome Electronics Co Ltd
 	5 Global Mixed-mode Technology Inc
-	6 Shenzhen Weien Electronics Co. Ltd.
+	6 Shenzhen Weien Electronics Co Ltd.
 	7 Shenzhen Cooyes Technology Co Ltd
-	8 ShenZhen ChaoYing ZhiNeng
+	8 ShenZhen ChaoYing ZhiNeng Technology
 	9 E-Rockic Technology Company Limited
 	10 Aerospace Science Memory Shenzhen
 	11 Shenzhen Quanji Technology Co Ltd
@@ -1804,7 +1804,7 @@
 	22 Chiplego Technology (Shanghai) Co Ltd
 	23 StoreSkill
 	24 Shenzhen Astou Technology Company
-	25 Guangdong LeafFive Technology Limited
+	25 Guangdong LeapFive Technology Limited
 	26 Jin JuQuan
 	27 Huaxuan Technology (Shenzhen) Co Ltd
 	28 Gigastone Corporation
@@ -1872,12 +1872,12 @@
 	90 SSCT
 	91 Sichuan Heentai Semiconductor Co Ltd
 	92 Zhejiang University
-	93 Guangzhou ShinGroup
+	93 www.shingroup.cn
 	94 Suzhou Nano Mchip Technology Company
 	95 Feature Integration Technology Inc
 	96 d-Matrix
 	97 Golden Memory
-	98 MACHENIKE
+	98 Qingdao Thunderobot Technology Co Ltd
 	99 Shenzhen Tianxiang Chuangxin Technology
 	100 HYPHY USA
 	101 Valkyrie
@@ -1899,12 +1899,12 @@
 	117 HOGE Technology Co Ltd
 	118 United Micro Technology (Shenzhen) Co
 	119 Fabric of Truth Inc
-	120 Epitech
+	120 Elpitech
 	121 Elitestek
 	122 Cornelis Networks Inc
 	123 WingSemi Technologies Co Ltd
 	124 ForwardEdge ASIC
-	125 Beijing Future Imprint Technology Co Ltd
+	125 Beijing Future Signet Technology Co Ltd
 	126 Fine Made Microelectronics Group Co Ltd
 16
 	1 Changxin Memory Technology (Shanghai)
@@ -1923,16 +1923,16 @@
 	14 Shenzhen Ranshuo Technology Co Limited
 	15 ScaleFlux
 	16 XC Memory
-	17 Guangzhou Beimu Technology Co., Ltd
+	17 Guangzhou Beimu Technology Co Ltd
 	18 Rays Semiconductor Nanjing Co Ltd
 	19 Milli-Centi Intelligence Technology Jiangsu
-	20 Zilia Technologioes
+	20 Zilia Technologies
 	21 Incore Semiconductors
 	22 Kinetic Technologies
 	23 Nanjing Houmo Technology Co Ltd
 	24 Suzhou Yige Technology Co Ltd
 	25 Shenzhen Techwinsemi Technology Co Ltd
-	26 Pure Array Technology (Shanghai) Co. Ltd
+	26 Pure Array Technology (Shanghai) Co Ltd
 	27 Shenzhen Techwinsemi Technology Udstore
 	28 RISE MODE
 	29 NEWREESTAR
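
Review note: the layout visible in this diff (a comment header, a bare bank number on its own line, then `<id> <name>` entries) can be read back with a few lines of Python. This is a minimal sketch, not the loader NWinfo itself uses; `load_jep106_ids` is a hypothetical name, and the tab in front of each entry is an assumption taken from the `"\t{} {}"` format string in parse_jep106.py. The sample rows are copied from the hunks above.

```python
def load_jep106_ids(text):
    """Parse jep106.ids-style text into {(bank, id): name}. Hypothetical helper."""
    table = {}
    bank = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the "# ..." header comments
        if line.isdigit():
            bank = int(line)  # a bare number on its own line starts a new bank
            continue
        ident, _, name = line.partition(" ")
        if bank is None or not ident.isdigit():
            continue  # ignore anything that does not look like "<id> <name>"
        table[(bank, int(ident))] = name
    return table


sample = (
    "\n"
    "# JEP106BM\n"
    "# Version: 2025.06.12\n"
    "1\n"
    "\t1 AMD\n"
    "\t2 AMI\n"
    "10\n"
    "\t1 Weltronics Co LTD\n"
)
ids = load_jep106_ids(sample)
print(ids[(1, 1)])
print(ids[(10, 1)])
```

Because each line is stripped before it is classified, the sketch does not depend on whether entries are indented with a tab or with spaces; it only relies on bank lines being purely numeric.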

libnw/version.h

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 #define NWINFO_MAJOR_VERSION 1
 #define NWINFO_MINOR_VERSION 4
 #define NWINFO_MICRO_VERSION 2
-#define NWINFO_BUILD_VERSION 0
+#define NWINFO_BUILD_VERSION 1
 
 #define NWINFO_VERSION NWINFO_MAJOR_VERSION,NWINFO_MINOR_VERSION,NWINFO_MICRO_VERSION,NWINFO_BUILD_VERSION
 #define NWINFO_VERSION_STR QUOTE(NWINFO_MAJOR_VERSION.NWINFO_MINOR_VERSION.NWINFO_MICRO_VERSION.NWINFO_BUILD_VERSION)

parse_jep106.py

Lines changed: 202 additions & 0 deletions
@@ -0,0 +1,202 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+import sys
+import re
+import datetime
+
+# PyMuPDF is imported as fitz
+try:
+    import fitz
+except ImportError:
+    print("FATAL: PyMuPDF library not found.")
+    print("--> Please install it by running: pip install pymupdf")
+    sys.exit(1)
+
+START_PAGE_INDEX = 5
+DATE_SEARCH_PAGE_INDEX = 4
+
+def extract_document_name(doc):
+    """Extracts the document identifier (e.g., JEP106BM) from the PDF."""
+    print("[INFO] Extracting document name...")
+    # Attempt to find in metadata first (the metadata dict or its title may be None)
+    meta_title = (doc.metadata or {}).get('title') or ''
+    match = re.search(r'(JEP106[A-Z]{2})', meta_title, re.IGNORECASE)
+    if match:
+        name = match.group(1).upper()
+        print(" |-- Found in metadata: '{}'".format(name))
+        return name
+
+    # Fallback to scanning the cover page
+    try:
+        print(" |-- Not in metadata, scanning cover page...")
+        cover_page_text = doc.load_page(0).get_text("text")
+        match = re.search(r'(JEP106[A-Z]{2})', cover_page_text, re.IGNORECASE)
+        if match:
+            name = match.group(1).upper()
+            print(" |-- Found on cover page: '{}'".format(name))
+            return name
+    except Exception as e:
+        print(" |-- [WARN] Error scanning cover page: {}".format(e))
+
+    print(" |-- [WARN] Could not find name. Using default 'JEP106BM'.")
+    return "JEP106BM"
+
+def extract_document_date(doc):
+    """Scans a specific page of the PDF to find the document's effective date."""
+    print("[INFO] Extracting document date from page {}...".format(DATE_SEARCH_PAGE_INDEX + 1))
+    try:
+        page = doc.load_page(DATE_SEARCH_PAGE_INDEX)
+        text = page.get_text("text")
+
+        match = re.search(r'The present list is complete as of\s+(.*?)\.', text, re.IGNORECASE)
+        if match:
+            date_str = match.group(1).strip()
+            dt_obj = datetime.datetime.strptime(date_str, "%B %d, %Y")
+            formatted_date = dt_obj.strftime("%Y.%m.%d")
+            print(" |-- Found and parsed date: '{}'".format(formatted_date))
+            return formatted_date
+    except Exception as e:
+        print(" |-- [WARN] Failed to parse date: {}".format(e))
+
+    print(" |-- [WARN] Could not extract date. Using current date as fallback.")
+    return datetime.date.today().strftime("%Y.%m.%d")
+
+def clean_manufacturer_name(raw_name):
+    """
+    Cleans the raw manufacturer name by stripping trailing table data and
+    normalizing internal whitespace.
+    """
+    # Strip the trailing bit columns and hex code, e.g. " 1 1 0 0 0 1 1 1 C7"
+    cleaned_name = re.sub(r'(\s+[01])+(\s+[0-9A-Fa-f]{2})?$', '', raw_name)
+
+    # Replace sequences of one or more whitespace characters with a single space.
+    cleaned_name = re.sub(r'\s+', ' ', cleaned_name)
+
+    # Map non-ASCII punctuation to ASCII so the file can be written as ASCII
+    replacements = {
+        '\u2019': "'",  # Right Single Quotation Mark -> Apostrophe
+        '\u2018': "'",  # Left Single Quotation Mark -> Apostrophe
+        '\u201d': '"',  # Right Double Quotation Mark -> Quotation Mark
+        '\u201c': '"',  # Left Double Quotation Mark -> Quotation Mark
+        '\u2014': '-',  # Em Dash -> Hyphen
+        '\u2013': '-',  # En Dash -> Hyphen
+    }
+
+    for old, new in replacements.items():
+        cleaned_name = cleaned_name.replace(old, new)
+
+    return cleaned_name.strip()
+
+def parse_jep106_pdf(input_path, output_path):
+    """
+    Parses the JEP106 PDF at input_path and writes the .ids list to output_path.
+    """
+    print("--- JEP106 Parser Started ---")
+    print("[INFO] Input PDF: {}".format(input_path))
+    print("[INFO] Output file: {}".format(output_path))
+
+    try:
+        doc = fitz.open(input_path)
+        print("[OK] PDF file opened successfully ({} pages).".format(len(doc)))
+    except Exception as e:
+        print("FATAL: Failed to open or read the PDF file '{}'.".format(input_path))
+        print("--> Error: {}".format(e))
+        return
+
+    output_lines = []
+    output_lines.append('')
+
+    # Header generation
+    output_lines.append("# {}".format(extract_document_name(doc)))
+    output_lines.append("# Version: {}".format(extract_document_date(doc)))
+    print("[OK] File header generated.")
+
+    current_bank = 0
+    manufacturer_count = 0
+    line_pattern = re.compile(r'^(\d{1,3})\s+(.*)')
+    print("\n--- Starting Parsing ---")
+
+    for page_num in range(START_PAGE_INDEX, len(doc)):
+        print("\n[PAGE {}/{}]".format(page_num + 1, len(doc)))
+        page = doc.load_page(page_num)
+        text = page.get_text("text")
+
+        # Check for the start of the appendix to stop parsing.
+        if "Annex A (informative) Name Changes" in text:
+            print(" [STOP] Detected start of Annex A. Terminating main content parsing.")
+            break
+
+        lines = text.split('\n')
+
+        if page_num == START_PAGE_INDEX and current_bank == 0:
+            current_bank = 1
+            output_lines.append(str(current_bank))
+            print(" -> Initialized to Bank {}.".format(current_bank))
+
+        # Check for bank switch text before parsing lines to correctly associate all entries
+        if "The following numbers are all in bank" in text:
+            current_bank += 1
+            output_lines.append(str(current_bank))
+            print(" -> Detected switch to Bank {}.".format(current_bank))
+
+        i = 0
+        while i < len(lines):
+            line = lines[i]
+            line_stripped = line.strip()
+
+            # Immediately prepare for the next iteration.
+            i += 1
+
+            if not line_stripped:
+                continue
+
+            match = line_pattern.match(line_stripped)
+            if match:
+                id_code, raw_name = match.groups()
+
+                # Look ahead to the next line for a possible continuation.
+                # A continuation line is a non-empty line that does NOT start with another ID.
+                if i < len(lines):  # Check if a next line exists.
+                    next_line_stripped = lines[i].strip()
+                    # Use regex to check if the next line is a continuation.
+                    if next_line_stripped and not line_pattern.match(next_line_stripped):
+                        # It's a continuation. Append it to the raw name.
+                        raw_name = f"{raw_name} {next_line_stripped}"
+                        # We have consumed the next line, so advance the index again.
+                        i += 1
+
+                # Skip entries that are not actual manufacturers
+                if "Continuation Code" in raw_name:
+                    print(" [SKIP] 'Continuation Code' entry.")
+                    continue
+
+                final_name = clean_manufacturer_name(raw_name)
+                output_lines.append("\t{} {}".format(id_code, final_name))
+                manufacturer_count += 1
+
+                print(" [OK] ID: {:<4} Name: {}".format(id_code, final_name))
+
+    print("\n--- Parsing Complete ---")
+    print("[INFO] Total manufacturers found: {}".format(manufacturer_count))
+    print("[INFO] Total banks processed: {}".format(current_bank))
+    print("[INFO] Writing {} final lines to '{}'...".format(len(output_lines), output_path))
+
+    try:
+        with open(output_path, 'w', encoding='ascii', errors='strict') as f:
+            f.write('\n'.join(output_lines))
+            f.write('\n')
+        print("\n--- SUCCESS ---")
+        print("Generated file: '{}'".format(output_path))
+    except Exception as e:
+        print("\n--- FATAL ERROR ---")
+        print("Failed to write to the output file '{}'.".format(output_path))
+        print("--> Character Encoding Error or I/O issue.")
+        print("--> Detailed Error: {}".format(e))
+
+if __name__ == '__main__':
+    if len(sys.argv) != 3:
+        print("Usage: python {} <input_pdf_file> <output_ids_file>".format(sys.argv[0]))
+        sys.exit(1)
+
+    parse_jep106_pdf(sys.argv[1], sys.argv[2])
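
Review note: two of the text transformations in this script are easy to check without a PDF or PyMuPDF. The sketch below reuses the same regular expression and `strptime`/`strftime` formats that appear in `clean_manufacturer_name` and `extract_document_date`. The helper names `strip_table_columns` and `to_ids_version` are hypothetical, the sample row is made up, and the input date string is an assumed example of the "%B %d, %Y" form the script expects (month names are parsed according to the active locale).

```python
import datetime
import re

# Same pattern as in clean_manufacturer_name: a run of single 0/1 columns,
# optionally followed by a two-character hex code, anchored at end of line.
TRAILING_BITS = re.compile(r'(\s+[01])+(\s+[0-9A-Fa-f]{2})?$')


def strip_table_columns(raw_name):
    """Drop trailing bit/hex columns, then collapse runs of whitespace."""
    name = TRAILING_BITS.sub('', raw_name)
    return re.sub(r'\s+', ' ', name).strip()


def to_ids_version(date_str):
    """Convert a 'Month D, YYYY' date into the 'YYYY.MM.DD' header form."""
    parsed = datetime.datetime.strptime(date_str, "%B %d, %Y")
    return parsed.strftime("%Y.%m.%d")


# Made-up table row: a name followed by eight bit columns and a hex code.
print(strip_table_columns("Example  Vendor 1 1 0 0 0 1 1 1 C7"))

# Edge case: a name that legitimately ends in a lone 0 or 1 is trimmed too.
print(strip_table_columns("Vendor 1"))

print(to_ids_version("June 12, 2025"))
```

The second call shows a limitation worth keeping in mind when reviewing regenerated entries: the pattern cannot tell a trailing bit column from a name whose last token is `0` or `1`, so such names would lose that token.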
