Feature/sof 6598 clean up parsers implementation (DRAFT) #82
Open
VsevolodX wants to merge 31 commits into dev from feature/SOF-6598-clean
Commits
d0a641c feat: add fortran parser and parsers settings (VsevolodX)
4c53512 update: rename and move formats enum (VsevolodX)
225485a feat: add BaseParser and MaterialsParser classes (VsevolodX)
cc27cc7 feat: add EspressoParser class and settings (VsevolodX)
ac86137 feat: add tests for Espresso parser (VsevolodX)
a3cdc53 feat: add fixtures for Espresso parser (VsevolodX)
10f41bf update: cleanup and correct methods order (VsevolodX)
ac715d4 feat: add test for fortran parser (VsevolodX)
e1f2c47 update: change test fixture to the correct one (VsevolodX)
931c808 fix: make tests pass (VsevolodX)
e7a4478 update: shorten function and address PR comments (VsevolodX)
65d4872 update: use mixin of FortranParser (VsevolodX)
e78589d update: remove node v10 from github cicd (VsevolodX)
a4269bf update: temporarily comment out fortranParserMixintest (VsevolodX)
7877ab4 update: fix the method called in fortranParserMixin test (VsevolodX)
9d87b70 update: move functions inside class (VsevolodX)
8acc823 update: change fortran tests directpry (VsevolodX)
4030ede update: address PR comments and move functions into methdos calling them (VsevolodX)
f8515fb update: simplify logic and variables addressing PR comments (VsevolodX)
1fb0809 feat: add handling for partially missing constraints and test for this (VsevolodX)
0dfbe64 update: add JSDocs and change comments FIXME -> TODO (VsevolodX)
e150e52 update: add and fix test to call ESPRESSO parser from outside (VsevolodX)
1c1e1af update: simplify code and move comments on separate line (VsevolodX)
0cf4997 update: rename fortranParserMixin methods to have "fortran" in them (VsevolodX)
45b952e update: change order of regexes to be in descending logic (VsevolodX)
45df941 update: rename KV parsing functions and add JSDoc example (VsevolodX)
f4c2322 feat: add more tests to cover more functions (VsevolodX)
22891df update: fix newly found error of constrains not setting to [] if not … (VsevolodX)
2f271c9 update: run npm lint:fix (VsevolodX)
3353a73 update: add suppression for ESLint in an abstract class (VsevolodX)
3434005 update: add accidentaly removed TODOs back in (VsevolodX)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| export const STRUCTURAL_INFORMATION_FORMATS = { | ||
| JSON: "json", | ||
| POSCAR: "poscar", | ||
| CIF: "cif", | ||
| QE: "qe", | ||
| XYZ: "xyz", | ||
| UNKNOWN: "unknown", | ||
| }; | ||
| | ||
| export const APPLICATIONS = { | ||
| ESPRESSO: "espresso", | ||
| VASP: "vasp", | ||
| UNKNOWN: "unknown", | ||
| }; |
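As a rough illustration of how these enums might be consumed, the sketch below pairs `STRUCTURAL_INFORMATION_FORMATS` with the `espressoFingerprint` pattern defined in the ESPRESSO settings file of this PR. The `detectFormat` helper is hypothetical and is not part of the diff; it only shows one plausible use under that assumption.

```javascript
// Copied from the diff above so the sketch runs on its own.
const STRUCTURAL_INFORMATION_FORMATS = {
    JSON: "json",
    POSCAR: "poscar",
    CIF: "cif",
    QE: "qe",
    XYZ: "xyz",
    UNKNOWN: "unknown",
};

// Same pattern as regex.espressoFingerprint in the ESPRESSO settings file of this PR.
const espressoFingerprint = /&CONTROL|&SYSTEM|ATOMIC_SPECIES/i;

// Hypothetical helper, not part of this PR: map raw file content to a format key.
function detectFormat(content) {
    if (espressoFingerprint.test(content)) return STRUCTURAL_INFORMATION_FORMATS.QE;
    return STRUCTURAL_INFORMATION_FORMATS.UNKNOWN;
}

const qeFormat = detectFormat("&control\n  title = 'Si'\n/\n");
const otherFormat = detectFormat("2\ncomment line\nH 0 0 0\nH 0 0 0.74\n");
```

Because the fingerprint has the `i` flag and no `g` flag, `test()` is case-insensitive and keeps no state between calls.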
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,249 @@ | ||
| import { ATOMIC_COORD_UNITS, coefficients } from "@exabyte-io/code.js/dist/constants"; | ||
| import { mix } from "mixwith"; | ||
| | ||
| import { primitiveCell } from "../../cell/primitive_cell"; | ||
| import { Lattice } from "../../lattice/lattice"; | ||
| import math from "../../math"; | ||
| import { MaterialParser } from "../structure"; | ||
| import { FortranParserMixin } from "../utils/fortran"; | ||
| import { IBRAV_TO_LATTICE_TYPE_MAP, regex } from "./settings"; | ||
| | ||
| export class ESPRESSOMaterialParser extends mix(MaterialParser).with(FortranParserMixin) { | ||
| parse(content) { | ||
| this.data = this.fortranParseNamelists(content); | ||
| return this.parseMaterial(); | ||
| } | ||
| | ||
| /** | ||
| * @summary Return unit cell parameters from CELL_PARAMETERS card | ||
| * @returns {{cell: Number[][], units: String}} | ||
| */ | ||
| getCell() { | ||
| const text = this.data.cards; | ||
| let cell = {}; | ||
| if (this.data.system === undefined) | ||
| throw new Error("No &SYSTEM section found in input data."); | ||
| if (this.data.system.ibrav === undefined) throw new Error("ibrav is required in &SYSTEM."); | ||
| | ||
| if (this.data.system.ibrav === 0) { | ||
| const match = regex.cellParameters.exec(text); | ||
| if (match) { | ||
| const units = match[1]; | ||
| const values = match.slice(2, 11); | ||
| // creating matrix 3 by 3 of numbers from 9 strings | ||
| const vectors = Array.from({ length: 3 }, (_, i) => | ||
| values.slice(i * 3, i * 3 + 3).map(Number), | ||
| ); | ||
| cell = { cell: vectors, units }; | ||
| // TODO: implement type detection, now defaults to TRI | ||
| cell.type = "TRI"; | ||
| return cell; | ||
| } | ||
| } else { | ||
| cell = this.ibravToCellConfig(); | ||
| return cell; | ||
| } | ||
| throw new Error("Couldn't read cell parameters"); | ||
| } | ||
| | ||
| /** | ||
| * @summary Return elements from ATOMIC_POSITIONS card, one per atom | ||
| * @returns {{id: Number, value: String}[]} | ||
| */ | ||
| getElements() { | ||
| const text = this.data.cards; | ||
| const atomicPositionsMatches = Array.from(text.matchAll(regex.atomicPositions)); | ||
| return atomicPositionsMatches.map((match, index) => ({ | ||
| id: index, | ||
| value: match[1], | ||
| })); | ||
| } | ||
| | ||
| /** | ||
| * @summary Return atomic positions from ATOMIC_POSITIONS card | ||
| * @returns {{id: Number, value: Number[]}[]} | ||
| */ | ||
| getCoordinates() { | ||
| const text = this.data.cards; | ||
| const atomicPositionsMatches = Array.from(text.matchAll(regex.atomicPositions)); | ||
| const { scalingFactor } = this.getCoordinatesUnitsScalingFactor(); | ||
| return atomicPositionsMatches.map((match, index) => ({ | ||
| id: index, | ||
| value: match.slice(2, 5).map((value) => parseFloat(value) * scalingFactor), | ||
| })); | ||
| } | ||
| | ||
| /** | ||
| * @summary Return atomic constraints from ATOMIC_POSITIONS card | ||
| * @returns {{id: Number, value: Boolean[]}[]} | ||
| */ | ||
| getConstraints() { | ||
| const text = this.data.cards; | ||
| const atomicPositionsMatches = Array.from(text.matchAll(regex.atomicPositions)); | ||
| | ||
| const constraints = atomicPositionsMatches.reduce((acc, match, index) => { | ||
| const value = match | ||
| .slice(5, 8) | ||
| .filter((constraint) => constraint !== undefined) | ||
| .map((constraint) => constraint === "1"); // expect only 0 or 1 as valid values | ||
| | ||
| acc.push({ | ||
| id: index, | ||
| value, | ||
| }); | ||
| | ||
| return acc; | ||
| }, []); | ||
| | ||
| // If all constraints are empty, return an empty array | ||
| if (constraints.every((constraint) => constraint.value.length === 0)) { | ||
| return []; | ||
| } | ||
| | ||
| return constraints; | ||
| } | ||
| | ||
| /** | ||
| * @summary Return atomic coordinates units from ATOMIC_POSITIONS card | ||
| * @returns {String} | ||
| */ | ||
| getUnits() { | ||
| return this.getCoordinatesUnitsScalingFactor().units; | ||
| } | ||
| | ||
| /** | ||
| * @summary Return material name from the title in the &CONTROL namelist | ||
| * If not present, the name is generated later from the formula in the materialConfig object | ||
| * @returns {String} | ||
| */ | ||
| getName() { | ||
| return this.data.control.title; | ||
| } | ||
| | ||
| /** | ||
| * @summary Returns cell config from ibrav and celldm(i) parameters | ||
| * | ||
| * QE docs: https://www.quantum-espresso.org/Doc/INPUT_PW.html#ibrav | ||
| * "If ibrav /= 0, specify EITHER [ celldm(1)-celldm(6) ] | ||
| * OR [ A, B, C, cosAB, cosAC, cosBC ] | ||
| * but NOT both. The lattice parameter "alat" is set to | ||
| * alat = celldm(1) (in a.u.) or alat = A (in Angstrom);" | ||
| * | ||
| * @returns {{cell: Number[][], type: String}} | ||
| */ | ||
| ibravToCellConfig() { | ||
| const { system } = this.data; | ||
| const { celldm } = system; | ||
| let { a, b, c } = system; | ||
| if (celldm && a) { | ||
| throw new Error("Both celldm and A are given"); | ||
| } else if (!celldm && !a) { | ||
| throw new Error("Either celldm(1) or A must be given"); | ||
| } | ||
| | ||
| const type = this.ibravToCellType(); | ||
| [a, b, c] = celldm ? this.getLatticeConstants() : [a, b, c]; | ||
| const [alpha, beta, gamma] = this.getLatticeAngles(); | ||
| | ||
| const lattice = new Lattice({ | ||
| a, | ||
| b, | ||
| c, | ||
| alpha, | ||
| beta, | ||
| gamma, | ||
| type, | ||
| }); | ||
| const cell = primitiveCell(lattice); | ||
| return { cell, type }; | ||
| } | ||
| | ||
| /** | ||
| * @summary Converts ibrav value to cell type according to Quantum ESPRESSO docs | ||
| * https://www.quantum-espresso.org/Doc/INPUT_PW.html#ibrav | ||
| * @returns {String} | ||
| */ | ||
| ibravToCellType() { | ||
| const { ibrav } = this.data.system; | ||
| const type = IBRAV_TO_LATTICE_TYPE_MAP[ibrav]; | ||
| if (type === undefined) { | ||
| throw new Error(`Invalid ibrav value: ${ibrav}`); | ||
| } | ||
| return type; | ||
| } | ||
| | ||
| /** | ||
| * @summary Calculates lattice constants a, b, c in angstrom from celldm(i). Specific to Quantum ESPRESSO. | ||
| * @returns {Number[]} | ||
| */ | ||
| getLatticeConstants() { | ||
| const { celldm } = this.data.system; | ||
| // celldm indices are shifted by -1 from the Fortran representation: celldm(1) in the QE input file is parsed as celldm[0]. | ||
| const a = celldm[0] * coefficients.BOHR_TO_ANGSTROM; // celldm(1) is a in bohr | ||
| const b = celldm[1] * celldm[0] * coefficients.BOHR_TO_ANGSTROM; // celldm(2) is b/a | ||
| const c = celldm[2] * celldm[0] * coefficients.BOHR_TO_ANGSTROM; // celldm(3) is c/a | ||
| return [a, b, c]; | ||
| } | ||
| | ||
| /** | ||
| * @summary Calculates cell angles in degrees from celldm(i) or cosAB, cosAC, cosBC parameters. Specific to Quantum ESPRESSO. | ||
| * @returns {Array<Number | undefined>} | ||
| */ | ||
| getLatticeAngles() { | ||
| const { celldm, cosbc, cosac, cosab } = this.data.system; | ||
| let alpha, beta, gamma; | ||
| // TODO: the truthiness checks below skip a cosine of exactly 0 (a 90 degree angle) | ||
| if (cosbc) alpha = math.acos(cosbc); | ||
| if (cosac) beta = math.acos(cosac); | ||
| if (cosab) gamma = math.acos(cosab); | ||
| | ||
| // Case for some of the cell types in QE docs | ||
| // celldm indices are shifted by -1 from the Fortran representation: celldm(4) in the QE input file is parsed as celldm[3]. | ||
| if (celldm && celldm[3]) { | ||
| gamma = math.acos(celldm[3]); | ||
| } | ||
| | ||
| // Case where celldm(4), celldm(5) and celldm(6) are all given (triclinic cell in QE docs) | ||
| if (celldm && celldm[3] && celldm[4] && celldm[5]) { | ||
| alpha = math.acos(celldm[3]); | ||
| beta = math.acos(celldm[4]); | ||
| gamma = math.acos(celldm[5]); | ||
| } | ||
| | ||
| // Convert radians to degrees which are used in lattice definitions | ||
| [alpha, beta, gamma] = [alpha, beta, gamma].map((x) => | ||
| x === undefined ? x : (x * 180) / math.PI, | ||
| ); | ||
| return [alpha, beta, gamma]; | ||
| } | ||
| | ||
| /** | ||
| * @summary Return units and scaling factor according to Quantum ESPRESSO 7.2 docs | ||
| * @returns {{units: String, scalingFactor: Number}} | ||
| */ | ||
| getCoordinatesUnitsScalingFactor() { | ||
| const units = this.data.cards.match(regex.atomicPositionsUnits)[1]; | ||
| let scalingFactor = 1.0; | ||
| let _units; | ||
| switch (units) { | ||
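| case "alat": | ||
| // TODO: in QE docs "alat" means cartesian coordinates in units of the lattice parameter; handled as crystal for now | ||
| _units = ATOMIC_COORD_UNITS.crystal; | ||
| break; | ||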
| case "bohr": | ||
| scalingFactor = coefficients.BOHR_TO_ANGSTROM; | ||
| _units = ATOMIC_COORD_UNITS.cartesian; | ||
| break; | ||
| case "angstrom": | ||
| _units = ATOMIC_COORD_UNITS.cartesian; | ||
| break; | ||
| case "crystal": | ||
| _units = ATOMIC_COORD_UNITS.crystal; | ||
| break; | ||
| case "crystal_sg": | ||
| throw new Error("crystal_sg not supported yet"); | ||
| default: | ||
| throw new Error(`Units ${units} not supported`); | ||
| } | ||
| return { units: _units, scalingFactor }; | ||
| } | ||
| } | ||
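The unit handling in `getLatticeConstants()` and `getLatticeAngles()` can be sketched in isolation. This is a minimal, standalone illustration rather than the PR's implementation: `BOHR_TO_ANGSTROM` is an assumed stand-in for `coefficients.BOHR_TO_ANGSTROM`, and `cosineToDegrees` is a hypothetical helper that uses an explicit `undefined` check so a cosine of exactly 0 still yields an angle.

```javascript
// Assumed stand-in for coefficients.BOHR_TO_ANGSTROM from @exabyte-io/code.js;
// the exact constant used by that package is not shown in this PR.
const BOHR_TO_ANGSTROM = 0.529177;

// celldm is zero-indexed here: celldm[0] holds celldm(1) from the QE input file.
function latticeConstantsFromCelldm(celldm) {
    const a = celldm[0] * BOHR_TO_ANGSTROM; // celldm(1) is a in bohr
    const b = celldm[1] * a; // celldm(2) is b/a
    const c = celldm[2] * a; // celldm(3) is c/a
    return [a, b, c];
}

// Hypothetical helper: convert a cosine to an angle in degrees.
// Only a missing value is passed through, so a cosine of 0 is still converted.
function cosineToDegrees(cosine) {
    if (cosine === undefined) return undefined;
    return (Math.acos(cosine) * 180) / Math.PI;
}

const constants = latticeConstantsFromCelldm([10.0, 1.0, 2.0]);
const rightAngle = cosineToDegrees(0);
const missingAngle = cosineToDegrees(undefined);
```

Here `rightAngle` comes out at approximately 90 degrees, while a truthiness check such as `if (cosine)` would have left it undefined.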
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| import { LATTICE_TYPE } from "../../lattice/types"; | ||
| import { regex as commonRegex } from "../utils/settings"; | ||
| | ||
| const { double } = commonRegex.general; | ||
| export const regex = { | ||
| espressoFingerprint: /&CONTROL|&SYSTEM|ATOMIC_SPECIES/i, | ||
| atomicSpecies: new RegExp( | ||
| "([A-Z][a-z]?)\\s+" + // element symbol Aa | ||
| `(${double})\\s` + // mass | ||
| "(\\S*)\\s*" + // potential source file name | ||
| "(?=\\n)", // end of line | ||
| "gm", | ||
| ), | ||
| atomicPositionsUnits: new RegExp( | ||
| "ATOMIC_POSITIONS\\s+" + // start of card | ||
| "\\(?" + // optional parentheses | ||
| "(\\w+)" + // units | ||
| "\\)?", // end of optional parentheses | ||
| ), | ||
| atomicPositions: new RegExp( | ||
| `^\\s*([A-Z][a-z]*)\\s+` + // atomic element symbol | ||
| `(${double})\\s+(${double})\\s+(${double})` + // atomic coordinates | ||
| `(?:\\s+(0|1)\\s+(0|1)\\s+(0|1))?(?=\\s*\\n)`, // atomic constraints | ||
| "gm", | ||
| ), | ||
| cellParameters: new RegExp( | ||
| `CELL_PARAMETERS\\s*(?:\\(?(\\w+)\\)?)?\\n` + | ||
| `^\\s*(${double})\\s+(${double})\\s+(${double})\\s*\\n` + | ||
| `^\\s*(${double})\\s+(${double})\\s+(${double})\\s*\\n` + | ||
| `^\\s*(${double})\\s+(${double})\\s+(${double})\\s*\\n`, | ||
| "m", // no "g" flag: exec() on a shared global regex carries lastIndex over between calls | ||
| ), | ||
| }; | ||
| | ||
| export const IBRAV_TO_LATTICE_TYPE_MAP = { | ||
| 1: LATTICE_TYPE.CUB, | ||
| 2: LATTICE_TYPE.FCC, | ||
| 3: LATTICE_TYPE.BCC, | ||
| "-3": LATTICE_TYPE.BCC, | ||
| 4: LATTICE_TYPE.HEX, | ||
| 5: LATTICE_TYPE.RHL, | ||
| "-5": LATTICE_TYPE.RHL, | ||
| 6: LATTICE_TYPE.TET, | ||
| 7: LATTICE_TYPE.BCT, | ||
| 8: LATTICE_TYPE.ORC, | ||
| 9: LATTICE_TYPE.ORCC, | ||
| "-9": LATTICE_TYPE.ORCC, | ||
| 10: LATTICE_TYPE.ORCF, | ||
| 11: LATTICE_TYPE.ORCI, | ||
| 12: LATTICE_TYPE.MCL, | ||
| "-12": LATTICE_TYPE.MCL, | ||
| 13: LATTICE_TYPE.MCLC, | ||
| "-13": LATTICE_TYPE.MCLC, | ||
| 14: LATTICE_TYPE.TRI, | ||
| }; |
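To see how the `atomicPositions` pattern behaves when constraints are only partially present, here is a small standalone sketch. The `double` pattern below is an assumption, since `commonRegex.general.double` is not shown in this PR. The regex construction and the constraint handling mirror the diff above.

```javascript
// Assumed stand-in for commonRegex.general.double, which is not shown in this PR.
// It must not contain capturing groups, or the group indices used below would shift.
const double = "[-+]?\\d*\\.?\\d+(?:[eE][-+]?\\d+)?";

// Same construction as regex.atomicPositions in the diff above.
const atomicPositions = new RegExp(
    "^\\s*([A-Z][a-z]*)\\s+" + // atomic element symbol
        `(${double})\\s+(${double})\\s+(${double})` + // atomic coordinates
        "(?:\\s+(0|1)\\s+(0|1)\\s+(0|1))?(?=\\s*\\n)", // atomic constraints
    "gm",
);

// Hypothetical input: the first atom has constraints, the second does not.
const cards = [
    "ATOMIC_POSITIONS (crystal)",
    "Si 0.00 0.00 0.00 0 1 1",
    "Si 0.25 0.25 0.25",
    "K_POINTS automatic",
    "",
].join("\n");

const matches = Array.from(cards.matchAll(atomicPositions));

const elements = matches.map((match, index) => ({ id: index, value: match[1] }));

// Groups 5 to 7 are undefined when the optional constraint block is absent.
const constraints = matches.map((match, index) => ({
    id: index,
    value: match
        .slice(5, 8)
        .filter((constraint) => constraint !== undefined)
        .map((constraint) => constraint === "1"),
}));
```

With this input, `constraints` holds `[false, true, true]` for the first atom and an empty array for the second, which is the partially missing case that `getConstraints()` handles.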
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| export class BaseParser { | ||
| constructor(options) { | ||
| this.options = options; | ||
| } | ||
| | ||
| // eslint-disable-next-line class-methods-use-this | ||
| parse() { | ||
| throw new Error("parse() is implemented in children"); | ||
| } | ||
| } |
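A short sketch of how a child class might build on `BaseParser`. `KeyValueParser` and its `separator` option are hypothetical and only illustrate the intended override of `parse()`; they are not part of this PR.

```javascript
// BaseParser is repeated from the diff above so the sketch runs on its own.
class BaseParser {
    constructor(options) {
        this.options = options;
    }

    // eslint-disable-next-line class-methods-use-this
    parse() {
        throw new Error("parse() is implemented in children");
    }
}

// Hypothetical child class, not part of this PR: parses "key = value" lines.
class KeyValueParser extends BaseParser {
    parse(content) {
        const separator = (this.options && this.options.separator) || "=";
        return content
            .split("\n")
            .map((line) => line.trim())
            .filter((line) => line.includes(separator))
            .reduce((acc, line) => {
                const index = line.indexOf(separator);
                acc[line.slice(0, index).trim()] = line.slice(index + separator.length).trim();
                return acc;
            }, {});
    }
}

const parsed = new KeyValueParser({ separator: "=" }).parse("ibrav = 2\ncelldm(1) = 10.2\n");
```

Here `parsed` is `{ ibrav: "2", "celldm(1)": "10.2" }`, while calling `parse()` on a bare `BaseParser` throws, which is how the base class signals that it is abstract.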