48 commits
e4b6245
Add MathMLTools as submodule
AndreG-P Mar 11, 2020
7915fa5
Update CoreNLP to 3.9.2
AndreG-P Apr 20, 2020
150a459
Fix noun phrase sequences
AndreG-P Apr 20, 2020
b4cdadd
Fix unrecognized <math> tag bug in MathConverter
AndreG-P Apr 21, 2020
85bf663
Delete empty mlp folder
AndreG-P Apr 21, 2020
b567ef3
WiP changes (bug fixes and extensions of sweble wikitext parser)
AndreG-P Apr 22, 2020
36b52e9
A lot of progress on sweble parser
AndreG-P Apr 24, 2020
6b79b89
Update knowledge structure of sentences and math expressions
AndreG-P Apr 25, 2020
a664039
First try to switch from identifier to MOI
AndreG-P Apr 27, 2020
75e9d27
Bug fixes in MOI approach
AndreG-P Apr 27, 2020
1b9da84
From identifier to MOI
AndreG-P May 11, 2020
aaf80f1
Find unicode math within normal text and treat it as math
AndreG-P May 12, 2020
c9d497a
Redevelop the WikiTextParser
AndreG-P May 14, 2020
587b7c2
Updating mathosphere to handle MOI
AndreG-P Oct 22, 2020
7d7653c
Update mlp according to updates of LaCASt
AndreG-P Oct 22, 2020
607f69a
core depends on evaluation, so the module order must be fixed
AndreG-P Oct 22, 2020
dc96f56
Fixing java 11 issues
AndreG-P Oct 22, 2020
4b6ef37
Fixing buggy logging in maven-assembly-plugin by massively updating i…
AndreG-P Oct 22, 2020
0fa87f6
Delete LaCASt dependency and bring MLP tests back to CI... since 2015…
AndreG-P Oct 27, 2020
1f904a6
Try fixing numerous of issues when it comes to flink
AndreG-P Oct 29, 2020
57ef56a
Alright, at least fixing the flink-kryo-serialization problems in mat…
AndreG-P Oct 30, 2020
d00a208
Adjusting mathosphere to new version of lacast
AndreG-P Oct 30, 2020
337210a
Making MLP ready by updating MOI-graph structure and access to be use…
AndreG-P Nov 2, 2020
b35e977
Updates according to needs in LaCASt
AndreG-P Nov 10, 2020
98b358c
Minor change to improve performance of merge definiens
AndreG-P Dec 2, 2020
e6f4724
Use merge-definiens also in the new datastructure of dependency trees…
AndreG-P Dec 3, 2020
4eeabca
Allow pure wikitext inputs (without page or text-tags)
AndreG-P Dec 3, 2020
e53849e
Workaround for bug in sweble with references
AndreG-P Dec 4, 2020
7b3a3b7
Several bug fixes and improvements in sweble wikitext parser
AndreG-P Dec 4, 2020
fa91b5f
Improve noun-merging (include possessive endings, determiner, and pre…
AndreG-P Dec 5, 2020
578b06f
Update a save way to compare positions
AndreG-P Dec 6, 2020
1c07b84
Update to handle the correct semantic graph structure of a parsed sen…
AndreG-P Dec 8, 2020
cf225c9
Remove link-replacements in text since they cause incorrect semantic …
AndreG-P Dec 8, 2020
dec42be
Performance improvements
AndreG-P Dec 10, 2020
fe5fdb0
Fix bug that creates multiple MathTags for a single formulae and over…
AndreG-P Dec 11, 2020
c7d79aa
Fix bug in NP merger
AndreG-P Dec 16, 2020
3c0f2d2
Fixing score calculation for candidates
AndreG-P Dec 21, 2020
17bead6
Update a helper function for MathTag and slightly update interfaces o…
AndreG-P Dec 31, 2020
896abea
Fix calculation and math-tag use persistent hash for formulae
AndreG-P Jan 5, 2021
8de0115
Fixing issues in sweble producing invalid tex for math tags
AndreG-P Jan 12, 2021
a1fbcb3
Catch nullpointer in case the indexed word does not exist
AndreG-P Jan 18, 2021
8e080e5
Synchronized methods
AndreG-P Jan 25, 2021
c2e776e
Add LaCASt's TeX pre-processor to MathTags
AndreG-P Jan 25, 2021
b7f47ba
Add minor test case to see if ''alpha''=2 is correctly detected
AndreG-P Jan 29, 2021
808d45c
Merge dashed-nouns
AndreG-P Feb 2, 2021
5952d88
Fix formula in definiens text bug
AndreG-P Feb 5, 2021
2217ead
Try fixing #194
AndreG-P May 20, 2021
9b1b8d9
Fixing more errors in old test cases
AndreG-P Oct 28, 2022
4 changes: 4 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -84,3 +84,7 @@ dependency-reduced-pom.xml
# Introducing the t-standard (copyrights reserved)
t/
/mathosphere-core/test/

# ignore node folder (necessary to run mathoid service locally)
node_modules/

4 changes: 4 additions & 0 deletions .gitmodules
Expand Up @@ -10,3 +10,7 @@
path = lib/GoUldI
url = https://github.com/ag-gipp/GoUldI.git
branch = master
[submodule "lib/MathMLTools"]
path = lib/MathMLTools
url = https://github.com/ag-gipp/MathMLTools.git
branch = master
2 changes: 1 addition & 1 deletion .travis.yml
@@ -1,7 +1,7 @@
language: java
sudo: required
jdk:
- oraclejdk8
- openjdk11

before_install:
- mvn clean -q
6 changes: 6 additions & 0 deletions README.md
Expand Up @@ -68,3 +68,9 @@ If test fail due to encoding problems in windows, set the environment variable
JAVA_TOOL_OPTIONS = -Dfile.encoding=UTF8
```
as suggested on [stackoverflow.](http://stackoverflow.com/a/28470840)

### Java 11
This project currently works with Java 11 (in theory with every version up to Java 16) but fails with Java 17 or newer.
The reason is that the project uses Flink, which relies heavily on reflection. Since Java 17, many of these reflective
accesses violate visibility restrictions. The resulting errors do not obviously point to a Java 17 visibility problem,
so fixing them might be tricky. For now, simply make sure you run the project with Java 11!
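
A minimal sketch of a startup check for the restriction described above. The class `JavaVersionGuard` and the method `isKnownToFail` are hypothetical names and not part of mathosphere; the threshold of 17 is taken from the note above, and `Runtime.version().feature()` requires Java 10 or newer.

```java
// Hypothetical helper, not part of mathosphere: warns when the running JVM
// is a version this README describes as failing (Java 17 or newer).
final class JavaVersionGuard {

    // First Java feature release that the README reports as failing.
    static final int FIRST_FAILING_FEATURE = 17;

    // Returns true if the given feature version (e.g. 11, 16, 17) is
    // expected to fail according to the README note.
    static boolean isKnownToFail(int featureVersion) {
        return featureVersion >= FIRST_FAILING_FEATURE;
    }

    public static void main(String[] args) {
        // Runtime.version().feature() yields the major release, e.g. 11 for Java 11.
        int feature = Runtime.version().feature();
        if (isKnownToFail(feature)) {
            System.err.println("Warning: running on Java " + feature
                    + "; this project is expected to fail on Java "
                    + FIRST_FAILING_FEATURE + " or newer. Use Java 11.");
        } else {
            System.out.println("Java " + feature + " is below the known failing version.");
        }
    }
}
```

Such a check only surfaces the version mismatch early; it does not make the reflective accesses work on newer JVMs.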
63,692 changes: 63,692 additions & 0 deletions Test Results - com_formulasearchengine_in_mathosphere-core.html

Large diffs are not rendered by default.

12 changes: 9 additions & 3 deletions basex/pom.xml
Expand Up @@ -53,6 +53,12 @@
<url>https://oss.sonatype.org/service/local/staging/deploy/maven2/</url>
</repository>
</distributionManagement>
<repositories>
<repository>
<id>basex</id>
<url>https://files.basex.org/maven</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>org.basex</groupId>
Expand Down Expand Up @@ -103,12 +109,12 @@
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>2.5.1</version>
<version>2.10.3</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.5.1</version>
<version>2.10.3</version>
</dependency>
<dependency>
<groupId>com.intellij</groupId>
Expand Down Expand Up @@ -152,7 +158,7 @@
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-javadoc-plugin</artifactId>
<version>2.9.1</version>
<version>3.2.0</version>
<executions>
<execution>
<id>attach-javadocs</id>
2 changes: 1 addition & 1 deletion config.dev.yaml
Expand Up @@ -9,7 +9,7 @@ worker_heap_limit_mb: 250

# Logger info
logging:
level: warn
level: info
# streams:
# # Use gelf-stream -> logstash
# - type: gelf
18 changes: 9 additions & 9 deletions evaluation/pom.xml
Expand Up @@ -29,14 +29,14 @@
</developers>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
<!-- <plugin>-->
<!-- <groupId>org.apache.maven.plugins</groupId>-->
<!-- <artifactId>maven-compiler-plugin</artifactId>-->
<!-- <configuration>-->
<!-- <source>1.8</source>-->
<!-- <target>1.8</target>-->
<!-- </configuration>-->
<!-- </plugin>-->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
Expand Down Expand Up @@ -73,7 +73,7 @@
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-csv</artifactId>
<version>RELEASE</version>
<version>1.8</version>
</dependency>
<dependency>
<groupId>junit</groupId>
2 changes: 1 addition & 1 deletion lib/GoUldI
1 change: 1 addition & 0 deletions lib/MathMLTools
Submodule MathMLTools added at 1597c6
2 changes: 1 addition & 1 deletion lib/RTED
Submodule RTED updated 1 files
+9 −5 pom.xml
5 changes: 5 additions & 0 deletions mathosphere-core/lacast.config.yaml
@@ -0,0 +1,5 @@
# This configuration defines the paths to the library and configuration folders for LaCASt.
# You can either specify the location of this file with a flag: "-config=pathToThisConfig.yaml",
# or put this file into the local directory or into the ".lacast" folder in your home directory.
lacast.libs.path: "/home/andreg-p/Projects/LaCASt/libs"
lacast.config.path: "/home/andreg-p/Projects/LaCASt/config"
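
The lookup described in the comments of this file could be sketched as follows. This is an illustration only, not LaCASt's actual implementation: the class `LacastConfigLocator`, the method `locate`, and the search order (flag first, then the local directory, then `.lacast` in the home directory) are assumptions, as is the expectation that the file keeps the name `lacast.config.yaml` in both directories.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical sketch of the config lookup; names and search order are assumed.
final class LacastConfigLocator {
    static final String FILE_NAME = "lacast.config.yaml"; // assumed file name
    static final String FLAG_PREFIX = "-config=";          // flag from the file's comment

    // Returns the config location: an explicit flag wins, then the working
    // directory, then the ".lacast" folder in the home directory.
    static Optional<Path> locate(String[] args, Path workingDir, Path homeDir) {
        for (String arg : args) {
            if (arg.startsWith(FLAG_PREFIX)) {
                return Optional.of(Paths.get(arg.substring(FLAG_PREFIX.length())));
            }
        }
        Path local = workingDir.resolve(FILE_NAME);
        if (Files.isRegularFile(local)) {
            return Optional.of(local);
        }
        Path inHome = homeDir.resolve(".lacast").resolve(FILE_NAME);
        if (Files.isRegularFile(inHome)) {
            return Optional.of(inHome);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Path cwd = Paths.get("").toAbsolutePath();
        Path home = Paths.get(System.getProperty("user.home"));
        System.out.println(locate(args, cwd, home)
                .map(Path::toString)
                .orElse("no LaCASt config found"));
    }
}
```

Note that the two paths committed in this file are absolute paths on one developer's machine, so a lookup like this only helps if each user supplies their own copy of the file.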
106 changes: 80 additions & 26 deletions mathosphere-core/pom.xml
Expand Up @@ -24,7 +24,7 @@
<repositories>
<repository>
<id>www2.ph.ed.ac.uk-releases</id>
<url>http://www2.ph.ed.ac.uk/maven2</url>
<url>https://www2.ph.ed.ac.uk/maven2</url>
</repository>
<repository>
<id>apache.snapshots</id>
Expand All @@ -39,6 +39,11 @@
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>gov.nist.drmf.interpreter</groupId>
<artifactId>interpreter.pom</artifactId>
<version>2.1-SNAPSHOT</version>
</dependency>
<!-- Logging via log4j2 -->
<dependency>
<groupId>org.apache.logging.log4j</groupId>
Expand All @@ -58,11 +63,22 @@
<version>2.10.0</version>
</dependency>

<!--dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.30</version>
</dependency>
<!-- <dependency>-->
<!-- <groupId>org.slf4j</groupId>-->
<!-- <artifactId>slf4j-log4j12</artifactId>-->
<!-- <version>1.7.30</version>-->
<!-- </dependency>-->

<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-slf4j-impl</artifactId>
<version>2.10.0</version>
</dependency-->
</dependency>

<!-- Other things -->
<dependency>
Expand All @@ -87,30 +103,44 @@
<version>9.5.1-6</version>
</dependency>
<dependency>
<groupId>com.formulasearchengine</groupId>
<artifactId>mathmltools</artifactId>
<version>0.3.1-SNAPSHOT</version>
<groupId>com.formulasearchengine.mathmltools</groupId>
<artifactId>mathml-core</artifactId>
<version>2.0.5-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>cz.muni.fi.mir</groupId>
<artifactId>mathml-canonicalizer</artifactId>
<version>1.2-Mathosphere-SNAPSHOT</version>
<version>1.3.1</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.3.2</version>
<version>3.10</version>
</dependency>

<!-- https://mvnrepository.com/artifact/org.eclipse.mylyn.wikitext/wikitext.mediawiki -->
<dependency>
<groupId>org.fusesource.wikitext</groupId>
<artifactId>wikitext-core</artifactId>
<version>1.4</version>
<groupId>org.eclipse.mylyn.wikitext</groupId>
<artifactId>wikitext.mediawiki</artifactId>
<version>0.9.4.I20090220-1600-e3x</version>
</dependency>

<dependency>
<groupId>org.fusesource.wikitext</groupId>
<artifactId>mediawiki-core</artifactId>
<version>1.4</version>
<groupId>org.netbeans.external</groupId>
<artifactId>org-eclipse-mylyn-wikitext-core</artifactId>
<version>RELEASE113</version>
</dependency>

<!-- <dependency>-->
<!-- <groupId>org.fusesource.wikitext</groupId>-->
<!-- <artifactId>wikitext-core</artifactId>-->
<!-- <version>1.4</version>-->
<!-- </dependency>-->
<!-- <dependency>-->
<!-- <groupId>org.fusesource.wikitext</groupId>-->
<!-- <artifactId>mediawiki-core</artifactId>-->
<!-- <version>1.4</version>-->
<!-- </dependency>-->
<dependency>
<groupId>com.beust</groupId>
<artifactId>jcommander</artifactId>
Expand All @@ -119,7 +149,7 @@
<dependency>
<groupId>org.sweble.wikitext</groupId>
<artifactId>swc-engine</artifactId>
<version>3.1.5</version>
<version>3.1.9</version>
</dependency>
<!-- Flink -->
<dependency>
Expand All @@ -129,14 +159,34 @@
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-java_2.10</artifactId>
<artifactId>flink-streaming-java_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.12</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.10</artifactId>
<artifactId>flink-runtime-web_2.12</artifactId>
<version>${flink.version}</version>
<scope>compile</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-databind -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.11.3</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-core -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>2.11.3</version>
</dependency>


<dependency>
<groupId>nz.ac.waikato.cms.weka</groupId>
<artifactId>LibSVM</artifactId>
Expand Down Expand Up @@ -167,7 +217,7 @@
<dependency>
<groupId>com.fasterxml.jackson.jr</groupId>
<artifactId>jackson-jr-objects</artifactId>
<version>2.5.0</version>
<version>${jackson.version}</version>
</dependency>

<dependency>
Expand All @@ -180,20 +230,20 @@
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<version>3.5.2</version>
<version>4.0.0</version>
</dependency>

<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<version>3.5.2</version>
<classifier>models</classifier> <!-- English models -->
<version>4.0.0</version>
<classifier>models</classifier> <!-- basic English models -->
</dependency>

<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<version>3.5.2</version>
<version>3.9.2</version>
<classifier>models-german</classifier> <!-- German models -->
</dependency>

Expand Down Expand Up @@ -246,12 +296,13 @@
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-text</artifactId>
<version>1.0</version>
<version>1.8</version>
</dependency>

<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-text</artifactId>
<version>1.0</version>
<groupId>javax.annotation</groupId>
<artifactId>javax.annotation-api</artifactId>
<version>1.3.2</version>
</dependency>
</dependencies>
<developers>
Expand Down Expand Up @@ -281,11 +332,14 @@
<configuration>
<includes>
<include>%regex[.*mathpd.*]</include>
<include>%regex[.*mlp.*]</include>
<include>%regex[.*wikitext.*]</include>
</includes>
</configuration>
</plugin>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<version>3.3.0</version>
<configuration>
<archive>
<manifest>
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
package com.formulasearchengine.mathosphere.mathpd;

import com.formulasearchengine.mathmltools.helper.XMLHelper;
import com.formulasearchengine.mathmltools.mml.CMMLInfo;
import com.formulasearchengine.mathmltools.xmlhelper.NonWhitespaceNodeList;
import com.formulasearchengine.mathmltools.xmlhelper.XMLHelper;
import com.formulasearchengine.mathmltools.xml.NonWhitespaceNodeList;
import com.formulasearchengine.mathosphere.mathpd.distances.earthmover.EarthMoverDistanceWrapper;
import com.formulasearchengine.mathosphere.mathpd.distances.earthmover.JFastEMD;
import com.formulasearchengine.mathosphere.mathpd.distances.earthmover.Signature;
Expand All @@ -21,6 +21,9 @@
import java.text.DecimalFormat;
import java.util.*;

//import com.formulasearchengine.mathmltools.xmlhelper.NonWhitespaceNodeList;
//import com.formulasearchengine.mathmltools.xmlhelper.XMLHelper;

/**
* Created by Felix Hamborg <[email protected]> on 05.12.16.
*/
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
import com.formulasearchengine.mathosphere.mathpd.pojos.ExtractedMathPDDocument;
import com.formulasearchengine.mathosphere.mlp.contracts.CreateCandidatesMapper;
import com.formulasearchengine.mathosphere.mlp.contracts.JsonSerializerMapper;
import com.formulasearchengine.mathosphere.mlp.contracts.TextAnnotatorMapper;
import com.formulasearchengine.mathosphere.mlp.contracts.WikiTextAnnotatorMapper;
import com.formulasearchengine.mathosphere.mlp.pojos.ParsedWikiDocument;
import com.formulasearchengine.mathosphere.mlp.pojos.WikiDocumentOutput;

Expand Down Expand Up @@ -459,7 +459,7 @@ public static DataSource<String> readPreprocessedFile(String pathname, Execution
}

public WikiDocumentOutput outDocFromText(FlinkPdCommandConfig config, String input) throws Exception {
final TextAnnotatorMapper textAnnotatorMapper = new TextAnnotatorMapper(config);
final WikiTextAnnotatorMapper textAnnotatorMapper = new WikiTextAnnotatorMapper(config);
textAnnotatorMapper.open(null);
final CreateCandidatesMapper candidatesMapper = new CreateCandidatesMapper(config);

Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
package com.formulasearchengine.mathosphere.mathpd.contracts;

import com.formulasearchengine.mathmltools.xmlhelper.NonWhitespaceNodeList;
import com.formulasearchengine.mathmltools.xml.NonWhitespaceNodeList;
import com.formulasearchengine.mathosphere.mathpd.Distances;
import com.formulasearchengine.mathosphere.mathpd.pojos.ArxivDocument;
import com.formulasearchengine.mathosphere.mathpd.pojos.ExtractedMathPDDocument;