diff --git a/doc/hotspot-unit-tests.md b/doc/hotspot-unit-tests.md
index e1222baa2e3a4..69a9530710987 100644
--- a/doc/hotspot-unit-tests.md
+++ b/doc/hotspot-unit-tests.md
@@ -106,7 +106,7 @@ Prefer having checks inside test code.
Not only does having test logic outside, e.g. verification method,
depending on asserts in product code contradict with several items
-above but also decreases test’s readability and stability. It is much
+above but also decreases test's readability and stability. It is much
easier to understand that a test is testing when all testing logic is
located inside a test or nearby in shared test libraries. As a rule of
thumb, the closer a check to a test, the better.
@@ -119,7 +119,7 @@ Prefer `EXPECT` over `ASSERT` if possible.
This is related to the [informativeness](#informativeness) property of
tests, information for other checks can help to better localize a
-defect’s root-cause. One should use `ASSERT` if it is impossible to
+defect's root-cause. One should use `ASSERT` if it is impossible to
continue test execution or if it does not make much sense. Later in
the text, `EXPECT` forms will be used to refer to both
`ASSERT/EXPECT`.
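
For instance, a minimal sketch in plain GoogleTest (the buffer and the checks are invented purely for illustration):

```c++
#include <cstdlib>
#include <cstring>
#include <gtest/gtest.h>

TEST(BufferExample, fill_and_check) {
  // If the allocation fails there is nothing left to verify, so stop with ASSERT.
  char* buf = static_cast<char*>(malloc(16));
  ASSERT_NE(buf, nullptr);

  memset(buf, 'x', 16);
  // Independent checks: EXPECT reports every failure instead of stopping at
  // the first one, which gives more information for root-cause analysis.
  EXPECT_EQ(buf[0], 'x');
  EXPECT_EQ(buf[15], 'x');
  free(buf);
}
```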
@@ -160,7 +160,7 @@ value of the difference between `v1` and `v2` is not greater than `eps`.
Use string special macros for C strings comparisons.
-`EXPECT_EQ` just compares pointers’ values, which is hardly what one
+`EXPECT_EQ` just compares pointers' values, which is hardly what one
wants comparing C strings. GoogleTest provides `EXPECT_STREQ` and
`EXPECT_STRNE` macros to compare C string contents. There are also
case-insensitive versions `EXPECT_STRCASEEQ`, `EXPECT_STRCASENE`.
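
For example, a small self-contained sketch (plain GoogleTest, arbitrary strings):

```c++
#include <cstdio>
#include <gtest/gtest.h>

TEST(CStringExample, compare_contents_not_pointers) {
  char actual[16];
  snprintf(actual, sizeof(actual), "%s", "hello");
  const char* expected = "hello";

  // EXPECT_EQ(expected, actual) would compare the two pointer values and fail
  // even though the text is identical; the STR macros compare the characters.
  EXPECT_STREQ(expected, actual);
  EXPECT_STRNE("goodbye", actual);
  EXPECT_STRCASEEQ("HELLO", actual);  // case-insensitive variant
}
```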
@@ -226,7 +226,7 @@ subsystem, etc.
This naming scheme helps to find tests, filter them and simplifies
test failure analysis. For example, class `Foo` - test group `Foo`,
-compiler logging subsystem - test group `CompilerLogging`, G1 GC — test
+compiler logging subsystem - test group `CompilerLogging`, G1 GC - test
group `G1GC`, and so forth.
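
For example (hypothetical test names, shown only to illustrate the grouping convention):

```c++
#include <gtest/gtest.h>

// Hypothetical tests, grouped after the tested class or subsystem and named
// after the behavior they cover.
TEST(Foo, rejects_null_argument) { SUCCEED(); }
TEST(CompilerLogging, rotates_log_files) { SUCCEED(); }
TEST(G1GC, region_size_is_power_of_two) { SUCCEED(); }
```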
### Filename
@@ -287,7 +287,7 @@ Fixture classes should be named after tested classes, subsystems, etc
All test purpose friends should have either `Test` or `Testable` suffix.
-It greatly simplifies understanding of friendship’s purpose and allows
+It greatly simplifies understanding of friendship's purpose and allows
statically check that private members are not exposed unexpectedly.
Having `FooTest` as a friend of `Foo` without any comments will be
understood as a necessary evil to get testability.
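
A simplified sketch of the idea (not taken from any real HotSpot class):

```c++
#include <gtest/gtest.h>

class Foo {
  // The suffix makes the purpose of the friendship obvious: only the test
  // fixture below gets access to the private state.
  friend class FooTest;
 public:
  void update() { _cached_value = 42; }
 private:
  int _cached_value = 0;
};

class FooTest : public ::testing::Test {
 protected:
  static int cached_value(const Foo& foo) { return foo._cached_value; }
};

TEST_F(FooTest, update_refreshes_cache) {
  Foo foo;
  foo.update();
  EXPECT_EQ(cached_value(foo), 42);
}
```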
@@ -397,7 +397,7 @@ and filter out inapplicable tests.
Restore changed flags.
It is quite common for tests to configure JVM in a certain way
-changing flags’ values. GoogleTest provides two ways to set up
+changing flags' values. GoogleTest provides two ways to set up
environment before a test and restore it afterward: using either
constructor and destructor or `SetUp` and `TearDown` functions. Both ways
require to use a test fixture class, which sometimes is too wordy. The
@@ -406,7 +406,7 @@ be used in such cases to restore/set values.
Caveats:
-* Changing a flag’s value could break the invariants between flags' values and hence could lead to unexpected/unsupported JVM state.
+* Changing a flag's value could break the invariants between flags' values and hence could lead to unexpected/unsupported JVM state.
* `FLAG_SET_*` macros can change more than one flag (in order to
maintain invariants) so it is hard to predict what flags will be
diff --git a/doc/testing.html b/doc/testing.html
index 6285fab1682e4..c1c0f15fed955 100644
--- a/doc/testing.html
+++ b/doc/testing.html
@@ -411,6 +411,13 @@
JCOV
special target <code>jcov-test</code> instead of <code>test</code>, e.g.
<code>make jcov-test TEST=jdk_lang</code>. This will make sure the JCov
image is built, and that JCov reporting is enabled.</p>
+<p>To include JCov coverage for just a subset of all modules, you can
+use the <code>--with-jcov-modules</code> arguments to
+<code>configure</code>, e.g.
+<code>--with-jcov-modules=jdk.compiler,java.desktop</code>.</p>
+<p>For more fine-grained control, you can pass arbitrary filters to JCov
+using <code>--with-jcov-filters</code>, and you can specify a specific
+JDK to instrument using <code>--with-jcov-input-jdk</code>.</p>
<p>The JCov report is stored in
<code>build/$BUILD/test-results/jcov-output/report</code>.</p>
<p>Please note that running with JCov reporting can be very memory
diff --git a/doc/testing.md b/doc/testing.md
index 351690c5e601c..f6a25457b5eae 100644
--- a/doc/testing.md
+++ b/doc/testing.md
@@ -345,6 +345,14 @@ The simplest way to run tests with JCov coverage report is to use the special
target `jcov-test` instead of `test`, e.g. `make jcov-test TEST=jdk_lang`. This
will make sure the JCov image is built, and that JCov reporting is enabled.
+To include JCov coverage for just a subset of all modules, you can use the
+`--with-jcov-modules` arguments to `configure`, e.g.
+`--with-jcov-modules=jdk.compiler,java.desktop`.
+
+For more fine-grained control, you can pass arbitrary filters to JCov using
+`--with-jcov-filters`, and you can specify a specific JDK to instrument
+using `--with-jcov-input-jdk`.
+
The JCov report is stored in `build/$BUILD/test-results/jcov-output/report`.
Please note that running with JCov reporting can be very memory intensive.
diff --git a/make/Bundles.gmk b/make/Bundles.gmk
index 58950b5fb1f71..ba8ec0c864b0e 100644
--- a/make/Bundles.gmk
+++ b/make/Bundles.gmk
@@ -174,9 +174,11 @@ else
JRE_IMAGE_HOMEDIR := $(JRE_IMAGE_DIR)
JDK_BUNDLE_SUBDIR := jdk-$(VERSION_NUMBER)
JRE_BUNDLE_SUBDIR := jre-$(VERSION_NUMBER)
+ STATIC_JDK_BUNDLE_SUBDIR := static-jdk-$(VERSION_NUMBER)
ifneq ($(DEBUG_LEVEL), release)
JDK_BUNDLE_SUBDIR := $(JDK_BUNDLE_SUBDIR)/$(DEBUG_LEVEL)
JRE_BUNDLE_SUBDIR := $(JRE_BUNDLE_SUBDIR)/$(DEBUG_LEVEL)
+ STATIC_JDK_BUNDLE_SUBDIR := $(STATIC_JDK_BUNDLE_SUBDIR)/$(DEBUG_LEVEL)
endif
# In certain situations, the JDK_IMAGE_DIR points to an image without the
# the symbols and demos. If so, the symobls and demos can be found in a
@@ -242,7 +244,10 @@ ifneq ($(filter product-bundles% legacy-bundles, $(MAKECMDGOALS)), )
)
JDK_SYMBOLS_BUNDLE_FILES := \
- $(call FindFiles, $(SYMBOLS_IMAGE_DIR))
+ $(filter-out \
+ %.stripped.pdb, \
+ $(call FindFiles, $(SYMBOLS_IMAGE_DIR)) \
+ )
TEST_DEMOS_BUNDLE_FILES := $(filter $(JDK_DEMOS_IMAGE_HOMEDIR)/demo/%, \
$(ALL_JDK_DEMOS_FILES))
@@ -497,6 +502,21 @@ ifneq ($(filter static-libs-graal-bundles, $(MAKECMDGOALS)), )
STATIC_LIBS_GRAAL_TARGETS += $(BUILD_STATIC_LIBS_GRAAL_BUNDLE)
endif
+#################################################################################
+
+ifneq ($(filter static-jdk-bundles, $(MAKECMDGOALS)), )
+ STATIC_JDK_BUNDLE_FILES := $(call FindFiles, $(STATIC_JDK_IMAGE_DIR))
+
+ $(eval $(call SetupBundleFile, BUILD_STATIC_JDK_BUNDLE, \
+ BUNDLE_NAME := $(STATIC_JDK_BUNDLE_NAME), \
+ FILES := $(STATIC_JDK_BUNDLE_FILES), \
+ BASE_DIRS := $(STATIC_JDK_IMAGE_DIR), \
+ SUBDIR := $(STATIC_JDK_BUNDLE_SUBDIR), \
+ ))
+
+ STATIC_JDK_TARGETS += $(BUILD_STATIC_JDK_BUNDLE)
+endif
+
################################################################################
product-bundles: $(PRODUCT_TARGETS)
@@ -507,11 +527,12 @@ docs-javase-bundles: $(DOCS_JAVASE_TARGETS)
docs-reference-bundles: $(DOCS_REFERENCE_TARGETS)
static-libs-bundles: $(STATIC_LIBS_TARGETS)
static-libs-graal-bundles: $(STATIC_LIBS_GRAAL_TARGETS)
+static-jdk-bundles: $(STATIC_JDK_TARGETS)
jcov-bundles: $(JCOV_TARGETS)
.PHONY: product-bundles test-bundles \
docs-jdk-bundles docs-javase-bundles docs-reference-bundles \
- static-libs-bundles static-libs-graal-bundles jcov-bundles
+ static-libs-bundles static-libs-graal-bundles static-jdk-bundles jcov-bundles
################################################################################
diff --git a/make/CompileInterimLangtools.gmk b/make/CompileInterimLangtools.gmk
index c869ea160c76d..4a2bbaec5b854 100644
--- a/make/CompileInterimLangtools.gmk
+++ b/make/CompileInterimLangtools.gmk
@@ -95,7 +95,7 @@ define SetupInterimModule
SRC := $(BUILDTOOLS_OUTPUTDIR)/gensrc/$1.interim \
$$(wildcard $(SUPPORT_OUTPUTDIR)/gensrc/$1) \
$(TOPDIR)/src/$1/share/classes, \
- EXCLUDES := sun javax/tools/snippet-files, \
+ EXCLUDES := sun, \
EXCLUDE_FILES := $(TOPDIR)/src/$1/share/classes/module-info.java \
$(TOPDIR)/src/$1/share/classes/javax/tools/ToolProvider.java \
$(TOPDIR)/src/$1/share/classes/com/sun/tools/javac/launcher/Main.java \
@@ -103,6 +103,7 @@ define SetupInterimModule
$(TOPDIR)/src/$1/share/classes/com/sun/tools/javac/launcher/MemoryModuleFinder.java \
$(TOPDIR)/src/$1/share/classes/com/sun/tools/javac/launcher/SourceLauncher.java \
Standard.java, \
+ EXCLUDE_PATTERNS := -files, \
EXTRA_FILES := $(BUILDTOOLS_OUTPUTDIR)/gensrc/$1.interim/module-info.java \
$($1.interim_EXTRA_FILES), \
COPY := .gif .png .xml .css .svg .js .js.template .txt .woff .woff2 javax.tools.JavaCompilerTool, \
diff --git a/make/CompileJavaModules.gmk b/make/CompileJavaModules.gmk
index b4a193dfadee6..1e26fb2b529cc 100644
--- a/make/CompileJavaModules.gmk
+++ b/make/CompileJavaModules.gmk
@@ -113,6 +113,7 @@ $(eval $(call SetupJavaCompilation, $(MODULE), \
DISABLED_WARNINGS := $(DISABLED_WARNINGS_java), \
EXCLUDES := $(EXCLUDES), \
EXCLUDE_FILES := $(EXCLUDE_FILES), \
+ EXCLUDE_PATTERNS := -files, \
KEEP_ALL_TRANSLATIONS := $(KEEP_ALL_TRANSLATIONS), \
JAVAC_FLAGS := \
$(DOCLINT) \
diff --git a/make/Coverage.gmk b/make/Coverage.gmk
index 2fd4e4ec6d454..a375c343185e8 100644
--- a/make/Coverage.gmk
+++ b/make/Coverage.gmk
@@ -34,21 +34,28 @@ else
JCOV_INPUT_IMAGE_DIR := $(JDK_IMAGE_DIR)
endif
+JCOV_SUPPORT_DIR := $(SUPPORT_OUTPUTDIR)/jcov
+
#moving instrumented jdk image in and out of jcov_temp because of CODETOOLS-7902299
-JCOV_TEMP := $(SUPPORT_OUTPUTDIR)/jcov_temp
+JCOV_TEMP := $(JCOV_SUPPORT_DIR)/temp
+
+ifneq ($(JCOV_MODULES), )
+ JCOV_MODULES_FILTER := $(foreach m, $(JCOV_MODULES), -include_module $m)
+endif
$(JCOV_IMAGE_DIR)/release: $(JCOV_INPUT_IMAGE_DIR)/release
$(call LogWarn, Creating instrumented jdk image with JCov)
$(call MakeDir, $(JCOV_TEMP) $(IMAGES_OUTPUTDIR))
$(RM) -r $(JCOV_IMAGE_DIR) $(JCOV_TEMP)/*
$(CP) -r $(JCOV_INPUT_IMAGE_DIR) $(JCOV_TEMP)/$(JCOV_IMAGE_SUBDIR)
- $(JAVA) -Xmx3g -jar $(JCOV_HOME)/lib/jcov.jar JREInstr \
+ $(call ExecuteWithLog, $(JCOV_SUPPORT_DIR)/run-jcov, \
+ $(JAVA) -Xmx3g -jar $(JCOV_HOME)/lib/jcov.jar JREInstr \
-t $(JCOV_TEMP)/$(JCOV_IMAGE_SUBDIR)/template.xml \
-rt $(JCOV_HOME)/lib/jcov_network_saver.jar \
-exclude 'java.lang.Object' \
-exclude jdk.test.Main -exclude '**\$Proxy*' \
- $(JCOV_FILTERS) \
- $(JCOV_TEMP)/$(JCOV_IMAGE_SUBDIR)
+ $(JCOV_MODULES_FILTER) $(JCOV_FILTERS) \
+ $(JCOV_TEMP)/$(JCOV_IMAGE_SUBDIR))
$(MV) $(JCOV_TEMP)/$(JCOV_IMAGE_SUBDIR) $(JCOV_IMAGE_DIR)
$(RMDIR) $(JCOV_TEMP)
diff --git a/make/Docs.gmk b/make/Docs.gmk
index 49c97946f7531..4733b22b1ad4c 100644
--- a/make/Docs.gmk
+++ b/make/Docs.gmk
@@ -79,6 +79,8 @@ JAVADOC_TAGS := \
-tag see \
-taglet build.tools.taglet.ExtLink \
-taglet build.tools.taglet.Incubating \
+ -taglet build.tools.taglet.PreviewNote \
+ --preview-note-tag previewNote \
-tagletpath $(BUILDTOOLS_OUTPUTDIR)/jdk_tools_classes \
$(CUSTOM_JAVADOC_TAGS) \
#
@@ -92,20 +94,16 @@ REFERENCE_TAGS := $(JAVADOC_TAGS)
JAVADOC_DISABLED_DOCLINT_WARNINGS := missing
JAVADOC_DISABLED_DOCLINT_PACKAGES := org.w3c.* javax.smartcardio
-# Allow overriding on the command line
-# (intentionally sharing name with the javac option)
-JAVA_WARNINGS_ARE_ERRORS ?= -Werror
-
# The initial set of options for javadoc
JAVADOC_OPTIONS := -use -keywords -notimestamp \
- -serialwarn -encoding ISO-8859-1 -docencoding UTF-8 -breakiterator \
+ -serialwarn -encoding utf-8 -docencoding utf-8 -breakiterator \
-splitIndex --system none -javafx --expand-requires transitive \
- --override-methods=summary
+ --override-methods=summary --syntax-highlight
# The reference options must stay stable to allow for comparisons across the
# development cycle.
REFERENCE_OPTIONS := -XDignore.symbol.file=true -use -keywords -notimestamp \
- -serialwarn -encoding ISO-8859-1 -breakiterator -splitIndex --system none \
+ -serialwarn -encoding utf-8 -breakiterator -splitIndex --system none \
-html5 -javafx --expand-requires transitive
# Should we add DRAFT stamps to the generated javadoc?
@@ -264,7 +262,7 @@ define create_overview_file
$$($1_OVERVIEW): $$($1_OVERVIEW_VARDEPS_FILE)
$$(call LogInfo, Creating overview.html for $1)
$$(call MakeDir, $$(@D))
- $$(PRINTF) > $$@ '$$($1_OVERVIEW_TEXT)'
+ $$(PRINTF) "%s" '$$($1_OVERVIEW_TEXT)' > $$@
endef
################################################################################
@@ -322,7 +320,9 @@ define SetupApiDocsGenerationBody
# Ignore the doclint warnings in certain packages
$1_OPTIONS += -Xdoclint/package:$$(call CommaList, $$(addprefix -, \
$$(JAVADOC_DISABLED_DOCLINT_PACKAGES)))
- $1_OPTIONS += $$(JAVA_WARNINGS_ARE_ERRORS)
+ ifeq ($$(JAVA_WARNINGS_AS_ERRORS), true)
+ $1_OPTIONS += -Werror
+ endif
$1_DOC_TITLE := $$($1_LONG_NAME) Version $$(VERSION_SPECIFICATION) API \
Specification
diff --git a/make/RunTestsPrebuiltFindTests.gmk b/make/GenerateFindTests.gmk
similarity index 100%
rename from make/RunTestsPrebuiltFindTests.gmk
rename to make/GenerateFindTests.gmk
diff --git a/make/GenerateLinkOptData.gmk b/make/GenerateLinkOptData.gmk
index 5fc745ba223f4..1f52b88e1ef77 100644
--- a/make/GenerateLinkOptData.gmk
+++ b/make/GenerateLinkOptData.gmk
@@ -76,10 +76,12 @@ $(CLASSLIST_FILE): $(INTERIM_IMAGE_DIR)/bin/java$(EXECUTABLE_SUFFIX) $(CLASSLIST
$(call LogInfo, Generating $(patsubst $(OUTPUTDIR)/%, %, $(JLI_TRACE_FILE)))
$(FIXPATH) $(INTERIM_IMAGE_DIR)/bin/java -XX:DumpLoadedClassList=$@.raw \
$(CLASSLIST_FILE_VM_OPTS) \
+ -Xlog:cds=off \
-cp $(SUPPORT_OUTPUTDIR)/classlist.jar \
build.tools.classlist.HelloClasslist $(LOG_DEBUG)
$(GREP) -v HelloClasslist $@.raw > $@.interim
$(FIXPATH) $(INTERIM_IMAGE_DIR)/bin/java -Xshare:dump \
+ -Xlog:cds=off \
-XX:SharedClassListFile=$@.interim -XX:SharedArchiveFile=$@.jsa \
-Xmx128M -Xms128M $(LOG_INFO)
$(FIXPATH) $(INTERIM_IMAGE_DIR)/bin/java -XX:DumpLoadedClassList=$@.raw.2 \
@@ -87,6 +89,7 @@ $(CLASSLIST_FILE): $(INTERIM_IMAGE_DIR)/bin/java$(EXECUTABLE_SUFFIX) $(CLASSLIST
-Djava.lang.invoke.MethodHandle.TRACE_RESOLVE=true \
$(CLASSLIST_FILE_VM_OPTS) \
--module-path $(SUPPORT_OUTPUTDIR)/classlist.jar \
+ -Xlog:cds=off \
-cp $(SUPPORT_OUTPUTDIR)/classlist.jar \
build.tools.classlist.HelloClasslist \
2> $(LINK_OPT_DIR)/stderr > $(JLI_TRACE_FILE) \
@@ -100,6 +103,7 @@ $(CLASSLIST_FILE): $(INTERIM_IMAGE_DIR)/bin/java$(EXECUTABLE_SUFFIX) $(CLASSLIST
$(GREP) -v HelloClasslist $@.raw.2 > $@.raw.3
$(GREP) -v @cp $@.raw.3 > $@.raw.4
$(FIXPATH) $(INTERIM_IMAGE_DIR)/bin/java \
+ -Xlog:cds=off \
-cp $(SUPPORT_OUTPUTDIR)/classlist.jar \
build.tools.classlist.SortClasslist $@.raw.4 > $@
diff --git a/make/Init.gmk b/make/Init.gmk
index 6da2fb985b62f..5dd1a71dd9a45 100644
--- a/make/Init.gmk
+++ b/make/Init.gmk
@@ -37,11 +37,9 @@ include MakeFileStart.gmk
include $(TOPDIR)/make/InitSupport.gmk
include LogUtils.gmk
-# Force early generation of module-deps.gmk and find-tests.gmk
+# Force early generation of module-deps.gmk
GENERATE_MODULE_DEPS_FILE := true
include Modules.gmk
-GENERATE_FIND_TESTS_FILE := true
-include FindTests.gmk
# Inclusion of this pseudo-target will cause make to execute this file
# serially, regardless of -j.
@@ -139,7 +137,7 @@ main: MAKEOVERRIDES :=
main: $(INIT_TARGETS)
ifneq ($(SEQUENTIAL_TARGETS)$(PARALLEL_TARGETS), )
$(call RotateLogFiles)
- $(PRINTF) "Building $(TARGET_DESCRIPTION)\n" $(BUILD_LOG_PIPE_SIMPLE)
+ $(ECHO) "Building $(TARGET_DESCRIPTION)" $(BUILD_LOG_PIPE_SIMPLE)
ifneq ($(SEQUENTIAL_TARGETS), )
# Don't touch build output dir since we might be cleaning. That
# means no log pipe.
@@ -160,7 +158,8 @@ main: $(INIT_TARGETS)
-f make/Main.gmk $(USER_MAKE_VARS) \
$(PARALLEL_TARGETS) $(COMPARE_BUILD_MAKE) $(BUILD_LOG_PIPE) || \
( exitcode=$$? && \
- $(PRINTF) "\nERROR: Build failed for $(TARGET_DESCRIPTION) (exit code $$exitcode) \n" \
+ $(ECHO) "" $(BUILD_LOG_PIPE_SIMPLE) && \
+ $(ECHO) "ERROR: Build failed for $(TARGET_DESCRIPTION) (exit code $$exitcode)" \
$(BUILD_LOG_PIPE_SIMPLE) && \
cd $(TOPDIR) && $(MAKE) $(MAKE_ARGS) -j 1 -f make/Init.gmk \
on-failure ; \
@@ -172,7 +171,7 @@ main: $(INIT_TARGETS)
if test -f $(MAKESUPPORT_OUTPUTDIR)/exit-with-error ; then \
exit 1 ; \
fi
- $(PRINTF) "Finished building $(TARGET_DESCRIPTION)\n" $(BUILD_LOG_PIPE_SIMPLE)
+ $(ECHO) "Finished building $(TARGET_DESCRIPTION)" $(BUILD_LOG_PIPE_SIMPLE)
$(call ReportProfileTimes)
endif
@@ -183,7 +182,8 @@ on-failure:
$(call PrintFailureReports)
$(call PrintBuildLogFailures)
$(call ReportProfileTimes)
- $(PRINTF) "HELP: Run 'make doctor' to diagnose build problems.\n\n"
+ $(ECHO) "HELP: Run 'make doctor' to diagnose build problems."
+ $(ECHO) ""
ifneq ($(COMPARE_BUILD), )
$(call CleanupCompareBuild)
endif
diff --git a/make/InitSupport.gmk b/make/InitSupport.gmk
index a9af44e4225b1..809d1128692a9 100644
--- a/make/InitSupport.gmk
+++ b/make/InitSupport.gmk
@@ -173,9 +173,10 @@ define PrintFailureReports
$(RM) $(MAKESUPPORT_OUTPUTDIR)/failure-summary.log ; \
$(if $(wildcard $(MAKESUPPORT_OUTPUTDIR)/failure-logs/*.log), \
( \
- $(PRINTF) "\n=== Output from failing command(s) repeated here ===\n" ; \
+ $(ECHO) "" ; \
+ $(ECHO) "=== Output from failing command(s) repeated here ===" ; \
$(foreach logfile, $(sort $(wildcard $(MAKESUPPORT_OUTPUTDIR)/failure-logs/*.log)), \
- $(PRINTF) "* For target $(notdir $(basename $(logfile))):\n" ; \
+ $(ECHO) "* For target $(notdir $(basename $(logfile))):" ; \
$(if $(filter all, $(LOG_REPORT)), \
$(GREP) -v -e "^Note: including file:" < $(logfile) || true ; \
, \
@@ -185,8 +186,9 @@ define PrintFailureReports
fi ; \
) \
) \
- $(PRINTF) "\n* All command lines available in $(MAKESUPPORT_OUTPUTDIR)/failure-logs.\n" ; \
- $(PRINTF) "=== End of repeated output ===\n" ; \
+ $(ECHO) "" ; \
+ $(ECHO) "* All command lines available in $(MAKESUPPORT_OUTPUTDIR)/failure-logs." ; \
+ $(ECHO) "=== End of repeated output ===" ; \
) >> $(MAKESUPPORT_OUTPUTDIR)/failure-summary.log \
) \
)
@@ -195,13 +197,16 @@ endef
define PrintBuildLogFailures
$(if $(filter none, $(LOG_REPORT)), , \
if $(GREP) -q "recipe for target .* failed" $(BUILD_LOG) 2> /dev/null; then \
- $(PRINTF) "\n=== Make failed targets repeated here ===\n" ; \
+ $(ECHO) "" ; \
+ $(ECHO) "=== Make failed targets repeated here ===" ; \
$(GREP) "recipe for target .* failed" $(BUILD_LOG) ; \
- $(PRINTF) "=== End of repeated output ===\n" ; \
- $(PRINTF) "\nHELP: Try searching the build log for the name of the first failed target.\n" ; \
+ $(ECHO) "=== End of repeated output ===" ; \
+ $(ECHO) "" ; \
+ $(ECHO) "HELP: Try searching the build log for the name of the first failed target." ; \
else \
- $(PRINTF) "\nNo indication of failed target found.\n" ; \
- $(PRINTF) "HELP: Try searching the build log for '] Error'.\n" ; \
+ $(ECHO) "" ; \
+ $(ECHO) "No indication of failed target found." ; \
+ $(ECHO) "HELP: Try searching the build log for '] Error'." ; \
fi >> $(MAKESUPPORT_OUTPUTDIR)/failure-summary.log ; \
$(CAT) $(MAKESUPPORT_OUTPUTDIR)/failure-summary.log \
)
diff --git a/make/Main.gmk b/make/Main.gmk
index eda3b79265ad7..3535ad16aae3d 100644
--- a/make/Main.gmk
+++ b/make/Main.gmk
@@ -875,6 +875,12 @@ $(eval $(call SetupTarget, static-libs-graal-bundles, \
DEPS := static-libs-graal-image, \
))
+$(eval $(call SetupTarget, static-jdk-bundles, \
+ MAKEFILE := Bundles, \
+ TARGET := static-jdk-bundles, \
+ DEPS := static-jdk-image, \
+))
+
ifeq ($(JCOV_ENABLED), true)
$(eval $(call SetupTarget, jcov-bundles, \
MAKEFILE := Bundles, \
diff --git a/make/MainSupport.gmk b/make/MainSupport.gmk
index f7ba4de2d53c5..d8dc894c1e983 100644
--- a/make/MainSupport.gmk
+++ b/make/MainSupport.gmk
@@ -58,76 +58,76 @@ endef
define CleanDocs
@$(PRINTF) "Cleaning docs ..."
- @$(PRINTF) "\n" $(LOG_DEBUG)
+ @$(ECHO) "" $(LOG_DEBUG)
$(RM) -r $(SUPPORT_OUTPUTDIR)/docs
$(RM) -r $(SUPPORT_OUTPUTDIR)/javadoc
$(RM) -r $(IMAGES_OUTPUTDIR)/docs
- @$(PRINTF) " done\n"
+ @$(ECHO) " done"
endef
# Cleans the dir given as $1
define CleanDir
- @$(PRINTF) "Cleaning $(strip $1) build artifacts ..."
- @$(PRINTF) "\n" $(LOG_DEBUG)
+ @$(PRINTF) "Cleaning %s build artifacts ..." "$(strip $1)"
+ @$(ECHO) "" $(LOG_DEBUG)
($(CD) $(OUTPUTDIR) && $(RM) -r $1)
- @$(PRINTF) " done\n"
+ @$(ECHO) " done"
endef
define CleanSupportDir
- @$(PRINTF) "Cleaning $(strip $1) build artifacts ..."
- @$(PRINTF) "\n" $(LOG_DEBUG)
+ @$(PRINTF) "Cleaning %s build artifacts ..." "$(strip $1)"
+ @$(ECHO) "" $(LOG_DEBUG)
$(RM) -r $(SUPPORT_OUTPUTDIR)/$(strip $1)
- @$(PRINTF) " done\n"
+ @$(ECHO) " done"
endef
define CleanMakeSupportDir
- @$(PRINTF) "Cleaning $(strip $1) make support artifacts ..."
- @$(PRINTF) "\n" $(LOG_DEBUG)
+ @$(PRINTF) "Cleaning %s make support artifacts ..." "$(strip $1)"
+ @$(ECHO) "" $(LOG_DEBUG)
$(RM) -r $(MAKESUPPORT_OUTPUTDIR)/$(strip $1)
- @$(PRINTF) " done\n"
+ @$(ECHO) " done"
endef
define CleanTest
- @$(PRINTF) "Cleaning test $(strip $1) ..."
- @$(PRINTF) "\n" $(LOG_DEBUG)
+ @$(PRINTF) "Cleaning test %s ..." "$(strip $1)"
+ @$(ECHO) "" $(LOG_DEBUG)
$(RM) -r $(SUPPORT_OUTPUTDIR)/test/$(strip $(subst -,/,$1))
# Remove as much of the test directory structure as is empty
$(RMDIR) -p $(dir $(SUPPORT_OUTPUTDIR)/test/$(strip $(subst -,/,$1))) 2> /dev/null || true
- @$(PRINTF) " done\n"
+ @$(ECHO) " done"
endef
define Clean-gensrc
- @$(PRINTF) "Cleaning gensrc $(if $1,for $(strip $1) )..."
- @$(PRINTF) "\n" $(LOG_DEBUG)
+ @$(PRINTF) "Cleaning gensrc %s..." "$(if $1,for $(strip $1) )"
+ @$(ECHO) "" $(LOG_DEBUG)
$(RM) -r $(SUPPORT_OUTPUTDIR)/gensrc/$(strip $1)
- @$(PRINTF) " done\n"
+ @$(ECHO) " done"
endef
define Clean-java
- @$(PRINTF) "Cleaning java $(if $1,for $(strip $1) )..."
- @$(PRINTF) "\n" $(LOG_DEBUG)
+ @$(PRINTF) "Cleaning java %s..." "$(if $1,for $(strip $1) )"
+ @$(ECHO) "" $(LOG_DEBUG)
$(RM) -r $(JDK_OUTPUTDIR)/modules/$(strip $1)
$(RM) -r $(SUPPORT_OUTPUTDIR)/special_classes/$(strip $1)
- $(PRINTF) " done\n"
- $(PRINTF) "Cleaning headers $(if $1,for $(strip $1)) ..."
+ $(ECHO) " done"
+ $(PRINTF) "Cleaning headers %s..." "$(if $1,for $(strip $1) )"
$(RM) -r $(SUPPORT_OUTPUTDIR)/headers/$(strip $1)
- @$(PRINTF) " done\n"
+ @$(ECHO) " done"
endef
define Clean-native
- @$(PRINTF) "Cleaning native $(if $1,for $(strip $1) )..."
- @$(PRINTF) "\n" $(LOG_DEBUG)
+ @$(PRINTF) "Cleaning native %s..." "$(if $1,for $(strip $1) )"
+ @$(ECHO) "" $(LOG_DEBUG)
$(RM) -r $(SUPPORT_OUTPUTDIR)/native/$(strip $1)
$(RM) -r $(SUPPORT_OUTPUTDIR)/modules_libs/$(strip $1)
$(RM) -r $(SUPPORT_OUTPUTDIR)/modules_cmds/$(strip $1)
- @$(PRINTF) " done\n"
+ @$(ECHO) " done"
endef
define Clean-include
- @$(PRINTF) "Cleaning include $(if $1,for $(strip $1) )..."
- @$(PRINTF) "\n" $(LOG_DEBUG)
+ @$(PRINTF) "Cleaning include %s..." "$(if $1,for $(strip $1) )"
+ @$(ECHO) "" $(LOG_DEBUG)
$(RM) -r $(SUPPORT_OUTPUTDIR)/modules_include/$(strip $1)
- @$(PRINTF) " done\n"
+ @$(ECHO) " done"
endef
define CleanModule
diff --git a/make/PreInit.gmk b/make/PreInit.gmk
index b70e15a3b8c25..bce61ccde5f52 100644
--- a/make/PreInit.gmk
+++ b/make/PreInit.gmk
@@ -48,6 +48,10 @@ include $(TOPDIR)/make/common/LogUtils.gmk
# a configuration. This will define ALL_GLOBAL_TARGETS.
include $(TOPDIR)/make/Global.gmk
+# Targets provided by Init.gmk.
+ALL_INIT_TARGETS := print-modules print-targets print-configuration \
+ print-tests reconfigure pre-compare-build post-compare-build
+
# CALLED_TARGETS is the list of targets that the user provided,
# or "default" if unspecified.
CALLED_TARGETS := $(if $(MAKECMDGOALS), $(MAKECMDGOALS), default)
@@ -93,10 +97,6 @@ ifneq ($(SKIP_SPEC), true)
# This will setup ALL_MAIN_TARGETS.
$(eval $(call DefineMainTargets, FORCE, $(firstword $(SPECS))))
- # Targets provided by Init.gmk.
- ALL_INIT_TARGETS := print-modules print-targets print-configuration \
- print-tests reconfigure pre-compare-build post-compare-build
-
# Separate called targets depending on type.
INIT_TARGETS := $(filter $(ALL_INIT_TARGETS), $(CALLED_SPEC_TARGETS))
MAIN_TARGETS := $(filter $(ALL_MAIN_TARGETS), $(CALLED_SPEC_TARGETS))
@@ -161,19 +161,19 @@ ifneq ($(SKIP_SPEC), true)
( cd $(TOPDIR) && \
$(foreach spec, $(SPECS), \
$(MAKE) $(MAKE_INIT_ARGS) -j 1 -f $(TOPDIR)/make/Init.gmk \
- SPEC=$(spec) $(MAKE_INIT_MAIN_TARGET_ARGS) \
- main && \
+ SPEC=$(spec) TOPDIR_ALT=$(TOPDIR) \
+ $(MAKE_INIT_MAIN_TARGET_ARGS) main && \
$(if $(and $(COMPARE_BUILD), $(PARALLEL_TARGETS)), \
$(MAKE) $(MAKE_INIT_ARGS) -f $(TOPDIR)/make/Init.gmk \
- SPEC=$(spec) \
+ SPEC=$(spec) TOPDIR_ALT=$(TOPDIR) \
COMPARE_BUILD="$(COMPARE_BUILD)" \
pre-compare-build && \
$(MAKE) $(MAKE_INIT_ARGS) -j 1 -f $(TOPDIR)/make/Init.gmk \
- SPEC=$(spec) $(MAKE_INIT_MAIN_TARGET_ARGS) \
+ SPEC=$(spec) TOPDIR_ALT=$(TOPDIR) \
COMPARE_BUILD="$(COMPARE_BUILD):NODRYRUN=true" \
- main && \
+ $(MAKE_INIT_MAIN_TARGET_ARGS) main && \
$(MAKE) $(MAKE_INIT_ARGS) -f $(TOPDIR)/make/Init.gmk \
- SPEC=$(spec) \
+ SPEC=$(spec) TOPDIR_ALT=$(TOPDIR) \
COMPARE_BUILD="$(COMPARE_BUILD):NODRYRUN=true" \
post-compare-build && \
) \
diff --git a/make/PreInitSupport.gmk b/make/PreInitSupport.gmk
index 66bcbd2209cdc..1d3c3ce913519 100644
--- a/make/PreInitSupport.gmk
+++ b/make/PreInitSupport.gmk
@@ -250,13 +250,14 @@ endef
# Param 1: FORCE = force generation of main-targets.gmk or LAZY = do not force.
# Param 2: The SPEC file to use.
define DefineMainTargets
+ SPEC_FILE := $(strip $2)
# We will start by making sure the main-targets.gmk file is removed, if
# make has not been restarted. By the -include, we will trigger the
# rule for generating the file (which is never there since we removed it),
# thus generating it fresh, and make will restart, incrementing the restart
# count.
- main_targets_file := $$(dir $(strip $2))make-support/main-targets.gmk
+ main_targets_file := $$(dir $$(SPEC_FILE))make-support/main-targets.gmk
ifeq ($$(MAKE_RESTARTS), )
# Only do this if make has not been restarted, and if we do not force it.
@@ -267,8 +268,11 @@ define DefineMainTargets
$$(main_targets_file):
@( cd $$(TOPDIR) && \
+ $$(MAKE) $$(MAKE_LOG_FLAGS) -r -R -f $$(TOPDIR)/make/GenerateFindTests.gmk \
+ -I $$(TOPDIR)/make/common SPEC=$$(SPEC_FILE) TOPDIR_ALT=$$(TOPDIR))
+ @( cd $$(TOPDIR) && \
$$(MAKE) $$(MAKE_LOG_FLAGS) -r -R -f $$(TOPDIR)/make/Main.gmk \
- -I $$(TOPDIR)/make/common SPEC=$(strip $2) NO_RECIPES=true \
+ -I $$(TOPDIR)/make/common SPEC=$$(SPEC_FILE) TOPDIR_ALT=$$(TOPDIR) NO_RECIPES=true \
$$(MAKE_LOG_VARS) \
create-main-targets-include )
diff --git a/make/RunTests.gmk b/make/RunTests.gmk
index 4f81c096a3330..3c0280353a9ae 100644
--- a/make/RunTests.gmk
+++ b/make/RunTests.gmk
@@ -75,9 +75,6 @@ endif
# This is the JDK that we will test
JDK_UNDER_TEST := $(JDK_IMAGE_DIR)
-# The JDK used to compile jtreg test code. By default it is the same as
-# JDK_UNDER_TEST.
-JDK_FOR_COMPILE := $(JDK_IMAGE_DIR)
TEST_RESULTS_DIR := $(OUTPUTDIR)/test-results
TEST_SUPPORT_DIR := $(OUTPUTDIR)/test-support
@@ -118,6 +115,7 @@ JTREG_COV_OPTIONS :=
ifeq ($(TEST_OPTS_JCOV), true)
JCOV_OUTPUT_DIR := $(TEST_RESULTS_DIR)/jcov-output
+ JCOV_SUPPORT_DIR := $(TEST_SUPPORT_DIR)/jcov-support
JCOV_GRABBER_LOG := $(JCOV_OUTPUT_DIR)/grabber.log
JCOV_RESULT_FILE := $(JCOV_OUTPUT_DIR)/result.xml
JCOV_REPORT := $(JCOV_OUTPUT_DIR)/report
@@ -537,22 +535,21 @@ define SetupRunGtestTestBody
$$(eval $1_PASSED := $$(shell $$(AWK) '/\[ PASSED \] .* tests?./ \
{ print $$$$4 }' $$($1_RESULT_FILE))) \
$$(if $$($1_PASSED), , $$(eval $1_PASSED := 0)) \
- $$(eval $1_SKIPPED := $$(shell $$(AWK) \
- '/YOU HAVE [0-9]+ DISABLED TEST/ { \
- if (match($$$$0, /[0-9]+/, arr)) { \
- print arr[0]; \
- found=1; \
- } \
- } \
- END { if (!found) print 0; }' \
- $$($1_RESULT_FILE))) \
+ $$(eval $1_GTEST_DISABLED := $$(shell $$(AWK) '/YOU HAVE .* DISABLED TEST/ \
+ { print $$$$3 }' $$($1_RESULT_FILE))) \
+ $$(if $$($1_GTEST_DISABLED), , $$(eval $1_GTEST_DISABLED := 0)) \
+ $$(eval $1_GTEST_SKIPPED := $$(shell $$(AWK) '/\[ SKIPPED \] .* tests?.*/ \
+ { print $$$$4 }' $$($1_RESULT_FILE))) \
+ $$(if $$($1_GTEST_SKIPPED), , $$(eval $1_GTEST_SKIPPED := 0)) \
+ $$(eval $1_SKIPPED := $$(shell \
+ $$(EXPR) $$($1_GTEST_DISABLED) + $$($1_GTEST_SKIPPED))) \
$$(eval $1_FAILED := $$(shell $$(AWK) '/\[ FAILED \] .* tests?, \
listed below/ { print $$$$4 }' $$($1_RESULT_FILE))) \
$$(if $$($1_FAILED), , $$(eval $1_FAILED := 0)) \
$$(eval $1_ERROR := $$(shell \
- $$(EXPR) $$($1_RUN) - $$($1_PASSED) - $$($1_FAILED))) \
+ $$(EXPR) $$($1_RUN) - $$($1_PASSED) - $$($1_FAILED) - $$($1_GTEST_SKIPPED))) \
$$(eval $1_TOTAL := $$(shell \
- $$(EXPR) $$($1_RUN) + $$($1_SKIPPED))) \
+ $$(EXPR) $$($1_RUN) + $$($1_GTEST_DISABLED))) \
, \
$$(eval $1_PASSED := 0) \
$$(eval $1_FAILED := 0) \
@@ -933,6 +930,11 @@ define SetupRunJtregTestBody
JTREG_AUTO_PROBLEM_LISTS += ProblemList-shenandoah.txt
endif
+ ifneq ($$(findstring --enable-preview, $$(JTREG_ALL_OPTIONS)), )
+ JTREG_AUTO_PROBLEM_LISTS += ProblemList-enable-preview.txt
+ endif
+
+
ifneq ($$(JTREG_EXTRA_PROBLEM_LISTS), )
# Accept both absolute paths as well as relative to the current test root.
$1_JTREG_BASIC_OPTIONS += $$(addprefix $$(JTREG_PROBLEM_LIST_PREFIX), $$(wildcard \
@@ -945,6 +947,11 @@ define SetupRunJtregTestBody
$1_JTREG_BASIC_OPTIONS += -e:JIB_HOME=$$(JIB_HOME)
endif
+ ifneq ($$(JDK_FOR_COMPILE), )
+ # Allow overriding the JDK used for compilation from the command line
+ $1_JTREG_BASIC_OPTIONS += -compilejdk:$$(JDK_FOR_COMPILE)
+ endif
+
$1_JTREG_BASIC_OPTIONS += -e:TEST_IMAGE_DIR=$(TEST_IMAGE_DIR)
$1_JTREG_BASIC_OPTIONS += -e:DOCS_JDK_IMAGE_DIR=$$(DOCS_JDK_IMAGE_DIR)
@@ -997,7 +1004,6 @@ define SetupRunJtregTestBody
$$(JTREG_JAVA) $$($1_JTREG_LAUNCHER_OPTIONS) \
-Dprogram=jtreg -jar $$(JT_HOME)/lib/jtreg.jar \
$$($1_JTREG_BASIC_OPTIONS) \
- -compilejdk:$$(JDK_FOR_COMPILE) \
-testjdk:$$(JDK_UNDER_TEST) \
-dir:$$(JTREG_TOPDIR) \
-reportDir:$$($1_TEST_RESULTS_DIR) \
@@ -1016,7 +1022,8 @@ define SetupRunJtregTestBody
$1_COMMAND_LINE := \
for i in {0..$$(JTREG_RETRY_COUNT)}; do \
if [ "$$$$i" != 0 ]; then \
- $$(PRINTF) "\nRetrying Jtreg run. Attempt: $$$$i\n"; \
+ $$(ECHO) ""; \
+ $$(ECHO) "Retrying Jtreg run. Attempt: $$$$i"; \
fi; \
$$($1_COMMAND_LINE); \
if [ "`$$(CAT) $$($1_EXITCODE)`" = "0" ]; then \
@@ -1029,10 +1036,12 @@ define SetupRunJtregTestBody
ifneq ($$(JTREG_REPEAT_COUNT), 0)
$1_COMMAND_LINE := \
for i in {1..$$(JTREG_REPEAT_COUNT)}; do \
- $$(PRINTF) "\nRepeating Jtreg run: $$$$i out of $$(JTREG_REPEAT_COUNT)\n"; \
+ $$(ECHO) ""; \
+ $$(ECHO) "Repeating Jtreg run: $$$$i out of $$(JTREG_REPEAT_COUNT)"; \
$$($1_COMMAND_LINE); \
if [ "`$$(CAT) $$($1_EXITCODE)`" != "0" ]; then \
- $$(PRINTF) "\nFailures detected, no more repeats.\n"; \
+ $$(ECHO) ""; \
+ $$(ECHO) "Failures detected, no more repeats."; \
break; \
fi; \
done
@@ -1335,12 +1344,14 @@ TARGETS += run-all-tests pre-run-test post-run-test run-test-report run-test
ifeq ($(TEST_OPTS_JCOV), true)
+ JCOV_VM_OPTS := -Xmx4g -Djdk.xml.totalEntitySizeLimit=0 -Djdk.xml.maxGeneralEntitySizeLimit=0
+
jcov-do-start-grabber:
$(call MakeDir, $(JCOV_OUTPUT_DIR))
if $(JAVA) -jar $(JCOV_HOME)/lib/jcov.jar GrabberManager -status 1>/dev/null 2>&1 ; then \
$(JAVA) -jar $(JCOV_HOME)/lib/jcov.jar GrabberManager -stop -stoptimeout 3600 ; \
fi
- $(JAVA) -Xmx4g -jar $(JCOV_HOME)/lib/jcov.jar Grabber -v -t \
+ $(JAVA) $(JCOV_VM_OPTS) -jar $(JCOV_HOME)/lib/jcov.jar Grabber -v -t \
$(JCOV_IMAGE_DIR)/template.xml -o $(JCOV_RESULT_FILE) \
1>$(JCOV_GRABBER_LOG) 2>&1 &
@@ -1353,6 +1364,10 @@ ifeq ($(TEST_OPTS_JCOV), true)
$(JAVA) -jar $(JCOV_HOME)/lib/jcov.jar GrabberManager -stop -stoptimeout 3600
JCOV_REPORT_TITLE := JDK code coverage report
+ ifneq ($(JCOV_MODULES), )
+ JCOV_MODULES_FILTER := $(foreach m, $(JCOV_MODULES), -include_module $m)
+ JCOV_REPORT_TITLE += Included modules: $(JCOV_MODULES)
+ endif
ifneq ($(JCOV_FILTERS), )
JCOV_REPORT_TITLE += Code filters: $(JCOV_FILTERS)
endif
@@ -1360,11 +1375,12 @@ ifeq ($(TEST_OPTS_JCOV), true)
jcov-gen-report: jcov-stop-grabber
$(call LogWarn, Generating JCov report ...)
- $(JAVA) -Xmx4g -jar $(JCOV_HOME)/lib/jcov.jar RepGen -sourcepath \
+ $(call ExecuteWithLog, $(JCOV_SUPPORT_DIR)/run-jcov-repgen, \
+ $(JAVA) $(JCOV_VM_OPTS) -jar $(JCOV_HOME)/lib/jcov.jar RepGen -sourcepath \
`$(ECHO) $(TOPDIR)/src/*/share/classes/ | $(TR) ' ' ':'` -fmt html \
- $(JCOV_FILTERS) \
+ $(JCOV_MODULES_FILTER) $(JCOV_FILTERS) \
-mainReportTitle "$(JCOV_REPORT_TITLE)" \
- -o $(JCOV_REPORT) $(JCOV_RESULT_FILE)
+ -o $(JCOV_REPORT) $(JCOV_RESULT_FILE))
TARGETS += jcov-do-start-grabber jcov-start-grabber jcov-stop-grabber \
jcov-gen-report
@@ -1384,7 +1400,7 @@ ifeq ($(TEST_OPTS_JCOV), true)
jcov-gen-diffcoverage: jcov-stop-grabber
$(call LogWarn, Generating diff coverage with changeset $(TEST_OPTS_JCOV_DIFF_CHANGESET) ... )
$(DIFF_COMMAND)
- $(JAVA) -Xmx4g -jar $(JCOV_HOME)/lib/jcov.jar \
+ $(JAVA) $(JCOV_VM_OPTS) -jar $(JCOV_HOME)/lib/jcov.jar \
DiffCoverage -replaceDiff "src/.*/classes/:" -all \
$(JCOV_RESULT_FILE) $(JCOV_SOURCE_DIFF) > \
$(JCOV_DIFF_COVERAGE_REPORT)
diff --git a/make/RunTestsPrebuilt.gmk b/make/RunTestsPrebuilt.gmk
index ea38e73d49cb4..f5fe1d3383023 100644
--- a/make/RunTestsPrebuilt.gmk
+++ b/make/RunTestsPrebuilt.gmk
@@ -217,9 +217,9 @@ else ifeq ($(OPENJDK_TARGET_OS), macosx)
else ifeq ($(OPENJDK_TARGET_OS), windows)
NUM_CORES := $(NUMBER_OF_PROCESSORS)
MEMORY_SIZE := $(shell \
- $(EXPR) `wmic computersystem get totalphysicalmemory -value \
- | $(GREP) = | $(SED) 's/\\r//g' \
- | $(CUT) -d "=" -f 2-` / 1024 / 1024 \
+ $(EXPR) `powershell -Command \
+ "(Get-CimInstance Win32_ComputerSystem).TotalPhysicalMemory" \
+ | $(SED) 's/\\r//g' ` / 1024 / 1024 \
)
endif
ifeq ($(NUM_CORES), )
@@ -298,7 +298,7 @@ test-prebuilt:
@$(RM) -f $(MAKESUPPORT_OUTPUTDIR)/exit-with-error
# We need to fill the FindTest cache before entering RunTests.gmk.
@cd $(TOPDIR)/make && $(MAKE) $(MAKE_ARGS) SPEC=$(SPEC) \
- -f RunTestsPrebuiltFindTests.gmk
+ -f GenerateFindTests.gmk
@cd $(TOPDIR)/make && $(MAKE) $(MAKE_ARGS) -f RunTests.gmk run-test \
TEST="$(TEST)"
diff --git a/make/SourceRevision.gmk b/make/SourceRevision.gmk
index 285aaae17b591..15399527e6a94 100644
--- a/make/SourceRevision.gmk
+++ b/make/SourceRevision.gmk
@@ -55,7 +55,7 @@ ifneq ($(and $(GIT), $(wildcard $(TOPDIR)/.git)), )
SCM_DIR := .git
ID_COMMAND := $(PRINTF) "git:%s%s\n" \
"$$($(GIT) log -n1 --format=%H | cut -c1-12)" \
- "$$(if test -n "$$($(GIT) status --porcelain)"; then printf '+'; fi)"
+ "$$(if test -n "$$($(GIT) status --porcelain)"; then $(PRINTF) '+'; fi)"
endif
ifeq ($(USE_SCM), true)
diff --git a/make/autoconf/basic.m4 b/make/autoconf/basic.m4
index 6daba35547bd6..0e9470a1cff31 100644
--- a/make/autoconf/basic.m4
+++ b/make/autoconf/basic.m4
@@ -134,17 +134,33 @@ AC_DEFUN_ONCE([BASIC_SETUP_BUILD_ENV],
)
AC_SUBST(BUILD_ENV)
+ AC_MSG_CHECKING([for locale to use])
if test "x$LOCALE" != x; then
# Check if we actually have C.UTF-8; if so, use it
if $LOCALE -a | $GREP -q -E "^C\.(utf8|UTF-8)$"; then
LOCALE_USED=C.UTF-8
+ AC_MSG_RESULT([C.UTF-8 (recommended)])
+ elif $LOCALE -a | $GREP -q -E "^en_US\.(utf8|UTF-8)$"; then
+ LOCALE_USED=en_US.UTF-8
+ AC_MSG_RESULT([en_US.UTF-8 (acceptable fallback)])
else
- AC_MSG_WARN([C.UTF-8 locale not found, using C locale])
- LOCALE_USED=C
+ # As a fallback, check if the user's locale is UTF-8. USER_LOCALE was saved
+ # by the wrapper configure script before autoconf messed up LC_ALL.
+ if $ECHO $USER_LOCALE | $GREP -q -E "\.(utf8|UTF-8)$"; then
+ LOCALE_USED=$USER_LOCALE
+ AC_MSG_RESULT([$USER_LOCALE (untested fallback)])
+ AC_MSG_WARN([Could not find C.UTF-8 or en_US.UTF-8 locale. This is not supported, and the build might fail unexpectedly.])
+ else
+ AC_MSG_RESULT([no UTF-8 locale found])
+ AC_MSG_WARN([No UTF-8 locale found. This is not supported. Proceeding with the C locale, but the build might fail unexpectedly.])
+ LOCALE_USED=C
+ fi
+ AC_MSG_NOTICE([The recommended locale is C.UTF-8, but en_US.UTF-8 is also accepted.])
fi
else
- AC_MSG_WARN([locale command not not found, using C locale])
- LOCALE_USED=C
+ LOCALE_USED=C.UTF-8
+ AC_MSG_RESULT([C.UTF-8 (default)])
+ AC_MSG_WARN([locale command not found, using C.UTF-8 locale])
fi
export LC_ALL=$LOCALE_USED
diff --git a/make/autoconf/basic_tools.m4 b/make/autoconf/basic_tools.m4
index eceb0ae6cc44f..5815c55c962ab 100644
--- a/make/autoconf/basic_tools.m4
+++ b/make/autoconf/basic_tools.m4
@@ -1,5 +1,5 @@
#
-# Copyright (c) 2011, 2024, Oracle and/or its affiliates. All rights reserved.
+# Copyright (c) 2011, 2025, Oracle and/or its affiliates. All rights reserved.
# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
#
# This code is free software; you can redistribute it and/or modify it
@@ -57,6 +57,7 @@ AC_DEFUN_ONCE([BASIC_SETUP_FUNDAMENTAL_TOOLS],
UTIL_LOOKUP_PROGS(LOCALE, locale)
UTIL_LOOKUP_PROGS(PATHTOOL, cygpath wslpath)
UTIL_LOOKUP_PROGS(CMD, cmd.exe, $PATH:/cygdrive/c/windows/system32:/mnt/c/windows/system32:/c/windows/system32)
+ UTIL_LOOKUP_PROGS(LSB_RELEASE, lsb_release)
])
################################################################################
@@ -106,9 +107,6 @@ AC_DEFUN_ONCE([BASIC_SETUP_TOOLS],
UTIL_LOOKUP_PROGS(READLINK, greadlink readlink)
UTIL_LOOKUP_PROGS(WHOAMI, whoami)
- # Tools only needed on some platforms
- UTIL_LOOKUP_PROGS(LSB_RELEASE, lsb_release)
-
# For compare.sh only
UTIL_LOOKUP_PROGS(CMP, cmp)
UTIL_LOOKUP_PROGS(UNIQ, uniq)
@@ -470,7 +468,15 @@ AC_DEFUN_ONCE([BASIC_SETUP_PANDOC],
AC_MSG_CHECKING([if the pandoc smart extension needs to be disabled for markdown])
if $PANDOC --list-extensions | $GREP -q '+smart'; then
AC_MSG_RESULT([yes])
- PANDOC_MARKDOWN_FLAG="markdown-smart"
+ PANDOC_MARKDOWN_FLAG="$PANDOC_MARKDOWN_FLAG-smart"
+ else
+ AC_MSG_RESULT([no])
+ fi
+
+ AC_MSG_CHECKING([if the pandoc tex_math_dollars extension needs to be disabled for markdown])
+ if $PANDOC --list-extensions | $GREP -q '+tex_math_dollars'; then
+ AC_MSG_RESULT([yes])
+ PANDOC_MARKDOWN_FLAG="$PANDOC_MARKDOWN_FLAG-tex_math_dollars"
else
AC_MSG_RESULT([no])
fi
diff --git a/make/autoconf/basic_windows.m4 b/make/autoconf/basic_windows.m4
index fb6fc526bfa21..dac6ec15db6ca 100644
--- a/make/autoconf/basic_windows.m4
+++ b/make/autoconf/basic_windows.m4
@@ -1,5 +1,5 @@
#
-# Copyright (c) 2011, 2022, Oracle and/or its affiliates. All rights reserved.
+# Copyright (c) 2011, 2025, Oracle and/or its affiliates. All rights reserved.
# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
#
# This code is free software; you can redistribute it and/or modify it
@@ -159,7 +159,7 @@ AC_DEFUN([BASIC_SETUP_PATHS_WINDOWS],
else
WINENV_PREFIX_ARG="$WINENV_PREFIX"
fi
- FIXPATH_ARGS="-e $PATHTOOL -p $WINENV_PREFIX_ARG -r ${WINENV_ROOT//\\/\\\\} -t $WINENV_TEMP_DIR -c $CMD -q"
+ FIXPATH_ARGS="-e $PATHTOOL -p $WINENV_PREFIX_ARG -r ${WINENV_ROOT//\\/\\\\} -t $WINENV_TEMP_DIR -c $CMD"
FIXPATH_BASE="$BASH $FIXPATH_DIR/fixpath.sh $FIXPATH_ARGS"
FIXPATH="$FIXPATH_BASE exec"
@@ -215,7 +215,7 @@ AC_DEFUN([BASIC_WINDOWS_FINALIZE_FIXPATH],
if test "x$OPENJDK_BUILD_OS" = xwindows; then
FIXPATH_CMDLINE=". $TOPDIR/make/scripts/fixpath.sh -e $PATHTOOL \
-p $WINENV_PREFIX_ARG -r ${WINENV_ROOT//\\/\\\\} -t $WINENV_TEMP_DIR \
- -c $CMD -q"
+ -c $CMD"
$ECHO > $OUTPUTDIR/fixpath '#!/bin/bash'
$ECHO >> $OUTPUTDIR/fixpath export PATH='"[$]PATH:'$PATH'"'
$ECHO >> $OUTPUTDIR/fixpath $FIXPATH_CMDLINE '"[$]@"'
diff --git a/make/autoconf/boot-jdk.m4 b/make/autoconf/boot-jdk.m4
index d39e6e75a94c1..feb16c7d1791f 100644
--- a/make/autoconf/boot-jdk.m4
+++ b/make/autoconf/boot-jdk.m4
@@ -1,5 +1,5 @@
#
-# Copyright (c) 2011, 2024, Oracle and/or its affiliates. All rights reserved.
+# Copyright (c) 2011, 2025, Oracle and/or its affiliates. All rights reserved.
# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
#
# This code is free software; you can redistribute it and/or modify it
@@ -180,11 +180,13 @@ AC_DEFUN([BOOTJDK_CHECK_JAVA_HOME],
# Test: Is there a java or javac in the PATH, which is a symlink to the JDK?
AC_DEFUN([BOOTJDK_CHECK_JAVA_IN_PATH_IS_SYMLINK],
[
- UTIL_LOOKUP_PROGS(JAVAC_CHECK, javac, , NOFIXPATH)
- UTIL_LOOKUP_PROGS(JAVA_CHECK, java, , NOFIXPATH)
- BINARY="$JAVAC_CHECK"
- if test "x$JAVAC_CHECK" = x; then
- BINARY="$JAVA_CHECK"
+ UTIL_LOOKUP_PROGS(JAVAC_CHECK, javac)
+ UTIL_GET_EXECUTABLE(JAVAC_CHECK) # Will setup JAVAC_CHECK_EXECUTABLE
+ UTIL_LOOKUP_PROGS(JAVA_CHECK, java)
+ UTIL_GET_EXECUTABLE(JAVA_CHECK) # Will setup JAVA_CHECK_EXECUTABLE
+ BINARY="$JAVAC_CHECK_EXECUTABLE"
+ if test "x$JAVAC_CHECK_EXECUTABLE" = x; then
+ BINARY="$JAVA_CHECK_EXECUTABLE"
fi
if test "x$BINARY" != x; then
# So there is a java(c) binary, it might be part of a JDK.
diff --git a/make/autoconf/build-aux/config.guess b/make/autoconf/build-aux/config.guess
index afdf7cb5f9205..ce9fb6cd16f38 100644
--- a/make/autoconf/build-aux/config.guess
+++ b/make/autoconf/build-aux/config.guess
@@ -1,6 +1,6 @@
#!/bin/sh
#
-# Copyright (c) 2012, 2023, Oracle and/or its affiliates. All rights reserved.
+# Copyright (c) 2012, 2025, Oracle and/or its affiliates. All rights reserved.
# Copyright (c) 2021, Azul Systems, Inc. All rights reserved.
# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
#
@@ -53,10 +53,10 @@ if [ "x$OUT" = x ]; then
fi
fi
-# Test and fix cygwin on x86_64
-echo $OUT | grep 86-pc-cygwin > /dev/null 2> /dev/null
+# Test and fix cygwin/msys CPUs
+echo $OUT | grep -e "-pc-cygwin" > /dev/null 2> /dev/null
if test $? != 0; then
- echo $OUT | grep 86-pc-mingw > /dev/null 2> /dev/null
+ echo $OUT | grep -e "-pc-mingw" > /dev/null 2> /dev/null
fi
if test $? = 0; then
case `echo $PROCESSOR_IDENTIFIER | cut -f1 -d' '` in
@@ -64,6 +64,10 @@ if test $? = 0; then
REAL_CPU=x86_64
OUT=$REAL_CPU`echo $OUT | sed -e 's/[^-]*//'`
;;
+ ARMv8)
+ REAL_CPU=aarch64
+ OUT=$REAL_CPU`echo $OUT | sed -e 's/[^-]*//'`
+ ;;
esac
fi
diff --git a/make/autoconf/build-performance.m4 b/make/autoconf/build-performance.m4
index 4414ea0d93c9d..10e86e751998e 100644
--- a/make/autoconf/build-performance.m4
+++ b/make/autoconf/build-performance.m4
@@ -1,5 +1,5 @@
#
-# Copyright (c) 2011, 2024, Oracle and/or its affiliates. All rights reserved.
+# Copyright (c) 2011, 2025, Oracle and/or its affiliates. All rights reserved.
# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
#
# This code is free software; you can redistribute it and/or modify it
@@ -75,7 +75,8 @@ AC_DEFUN([BPERF_CHECK_MEMORY_SIZE],
FOUND_MEM=yes
elif test "x$OPENJDK_BUILD_OS" = xwindows; then
# Windows, but without cygwin
- MEMORY_SIZE=`wmic computersystem get totalphysicalmemory -value | grep = | cut -d "=" -f 2-`
+ MEMORY_SIZE=`powershell -Command \
+ "(Get-CimInstance Win32_ComputerSystem).TotalPhysicalMemory" | $SED 's/\\r//g' `
MEMORY_SIZE=`expr $MEMORY_SIZE / 1024 / 1024`
FOUND_MEM=yes
fi
diff --git a/make/autoconf/configure b/make/autoconf/configure
index 6fa0aacfbc907..443a37bae77db 100644
--- a/make/autoconf/configure
+++ b/make/autoconf/configure
@@ -49,7 +49,9 @@ fi
export CONFIG_SHELL=$BASH
export _as_can_reexec=no
-# Make sure all shell commands are executed with the C locale
+# Save user's current locale, but make sure all future shell commands are
+# executed with the C locale
+export USER_LOCALE=$LC_ALL
export LC_ALL=C
if test "x$CUSTOM_CONFIG_DIR" != x; then
diff --git a/make/autoconf/configure.ac b/make/autoconf/configure.ac
index fffe17daad875..e05b5ae3b90f0 100644
--- a/make/autoconf/configure.ac
+++ b/make/autoconf/configure.ac
@@ -261,6 +261,7 @@ JDKOPT_ENABLE_DISABLE_CDS_ARCHIVE_COH
JDKOPT_ENABLE_DISABLE_COMPATIBLE_CDS_ALIGNMENT
JDKOPT_SETUP_MACOSX_SIGNING
JDKOPT_SETUP_SIGNING_HOOK
+JDKOPT_SETUP_JAVA_WARNINGS
################################################################################
#
diff --git a/make/autoconf/flags-cflags.m4 b/make/autoconf/flags-cflags.m4
index 374ee1851a580..eb0e5e20e4c5f 100644
--- a/make/autoconf/flags-cflags.m4
+++ b/make/autoconf/flags-cflags.m4
@@ -573,12 +573,20 @@ AC_DEFUN([FLAGS_SETUP_CFLAGS_HELPER],
TOOLCHAIN_CFLAGS_JDK="$TOOLCHAIN_CFLAGS_JDK -fvisibility=hidden -fstack-protector"
elif test "x$TOOLCHAIN_TYPE" = xmicrosoft; then
- # The -utf-8 option sets source and execution character sets to UTF-8 to enable correct
- # compilation of all source files regardless of the active code page on Windows.
- TOOLCHAIN_CFLAGS_JVM="-nologo -MD -Zc:preprocessor -Zc:inline -Zc:throwingNew -permissive- -utf-8 -MP"
- TOOLCHAIN_CFLAGS_JDK="-nologo -MD -Zc:preprocessor -Zc:inline -Zc:throwingNew -permissive- -utf-8 -Zc:wchar_t-"
+ TOOLCHAIN_CFLAGS_JVM="-nologo -MD -Zc:preprocessor -Zc:inline -Zc:throwingNew -permissive- -MP"
+ TOOLCHAIN_CFLAGS_JDK="-nologo -MD -Zc:preprocessor -Zc:inline -Zc:throwingNew -permissive- -Zc:wchar_t-"
fi
+ # Set character encoding in source
+ if test "x$TOOLCHAIN_TYPE" = xgcc || test "x$TOOLCHAIN_TYPE" = xclang; then
+ CHARSET_CFLAGS="-finput-charset=utf-8"
+ elif test "x$TOOLCHAIN_TYPE" = xmicrosoft; then
+ # The -utf-8 option sets both source and execution character sets
+ CHARSET_CFLAGS="-utf-8 -validate-charset"
+ fi
+ TOOLCHAIN_CFLAGS_JVM="$TOOLCHAIN_CFLAGS_JVM $CHARSET_CFLAGS"
+ TOOLCHAIN_CFLAGS_JDK="$TOOLCHAIN_CFLAGS_JDK $CHARSET_CFLAGS"
+
# CFLAGS C language level for JDK sources (hotspot only uses C++)
if test "x$TOOLCHAIN_TYPE" = xgcc || test "x$TOOLCHAIN_TYPE" = xclang; then
LANGSTD_CFLAGS="-std=c11"
@@ -721,7 +729,7 @@ AC_DEFUN([FLAGS_SETUP_CFLAGS_CPU_DEP],
$1_CFLAGS_CPU="-fsigned-char -Wno-psabi $ARM_ARCH_TYPE_FLAGS $ARM_FLOAT_TYPE_FLAGS -DJDK_ARCH_ABI_PROP_NAME='\"\$(JDK_ARCH_ABI_PROP_NAME)\"'"
$1_CFLAGS_CPU_JVM="-DARM"
elif test "x$FLAGS_CPU_ARCH" = xppc; then
- $1_CFLAGS_CPU_JVM="-minsert-sched-nops=regroup_exact -mno-multiple -mno-string"
+ $1_CFLAGS_CPU_JVM="-mno-multiple -mno-string"
if test "x$FLAGS_CPU" = xppc64; then
# -mminimal-toc fixes `relocation truncated to fit' error for gcc 4.1.
# Use ppc64 instructions, but schedule for power5
diff --git a/make/autoconf/help.m4 b/make/autoconf/help.m4
index 400acf11a6329..d8c0b2ffaeffb 100644
--- a/make/autoconf/help.m4
+++ b/make/autoconf/help.m4
@@ -228,19 +228,19 @@ AC_DEFUN_ONCE([HELP_PRINT_ADDITIONAL_HELP_AND_EXIT],
if test "x$CONFIGURE_PRINT_ADDITIONAL_HELP" != x; then
# Print available toolchains
- $PRINTF "The following toolchains are valid as arguments to --with-toolchain-type.\n"
- $PRINTF "Which are available to use depends on the build platform.\n"
+ $ECHO "The following toolchains are valid as arguments to --with-toolchain-type."
+ $ECHO "Which are available to use depends on the build platform."
for toolchain in $VALID_TOOLCHAINS_all; do
# Use indirect variable referencing
toolchain_var_name=TOOLCHAIN_DESCRIPTION_$toolchain
TOOLCHAIN_DESCRIPTION=${!toolchain_var_name}
$PRINTF " %-22s %s\n" $toolchain "$TOOLCHAIN_DESCRIPTION"
done
- $PRINTF "\n"
+ $ECHO ""
# Print available JVM features
- $PRINTF "The following JVM features are valid as arguments to --with-jvm-features.\n"
- $PRINTF "Which are available to use depends on the environment and JVM variant.\n"
+ $ECHO "The following JVM features are valid as arguments to --with-jvm-features."
+ $ECHO "Which are available to use depends on the environment and JVM variant."
m4_foreach(FEATURE, m4_split(jvm_features_valid), [
# Create an m4 variable containing the description for FEATURE.
m4_define(FEATURE_DESCRIPTION, [jvm_feature_desc_]m4_translit(FEATURE, -, _))
@@ -257,115 +257,117 @@ AC_DEFUN_ONCE([HELP_PRINT_SUMMARY_AND_WARNINGS],
[
# Finally output some useful information to the user
- printf "\n"
- printf "====================================================\n"
+ $ECHO ""
+ $ECHO "===================================================="
if test "x$no_create" != "xyes"; then
if test "x$IS_RECONFIGURE" != "xyes"; then
- printf "A new configuration has been successfully created in\n%s\n" "$OUTPUTDIR"
+ $ECHO "A new configuration has been successfully created in"
+ $ECHO "$OUTPUTDIR"
else
- printf "The existing configuration has been successfully updated in\n%s\n" "$OUTPUTDIR"
+ $ECHO "The existing configuration has been successfully updated in"
+ $ECHO "$OUTPUTDIR"
fi
else
if test "x$IS_RECONFIGURE" != "xyes"; then
- printf "A configuration has been successfully checked but not created\n"
+ $ECHO "A configuration has been successfully checked but not created"
else
- printf "The existing configuration has been successfully checked in\n%s\n" "$OUTPUTDIR"
+ $ECHO "The existing configuration has been successfully checked in"
+ $ECHO "$OUTPUTDIR"
fi
fi
if test "x$CONFIGURE_COMMAND_LINE" != x; then
- printf "using configure arguments '$CONFIGURE_COMMAND_LINE'.\n"
+ $ECHO "using configure arguments '$CONFIGURE_COMMAND_LINE'."
else
- printf "using default settings.\n"
+ $ECHO "using default settings."
fi
if test "x$REAL_CONFIGURE_COMMAND_EXEC_FULL" != x; then
- printf "\n"
- printf "The original configure invocation was '$REAL_CONFIGURE_COMMAND_EXEC_SHORT $REAL_CONFIGURE_COMMAND_LINE'.\n"
+ $ECHO ""
+ $ECHO "The original configure invocation was '$REAL_CONFIGURE_COMMAND_EXEC_SHORT $REAL_CONFIGURE_COMMAND_LINE'."
fi
- printf "\n"
- printf "Configuration summary:\n"
- printf "* Name: $CONF_NAME\n"
- printf "* Debug level: $DEBUG_LEVEL\n"
- printf "* HS debug level: $HOTSPOT_DEBUG_LEVEL\n"
- printf "* JVM variants: $JVM_VARIANTS\n"
- printf "* JVM features: "
+ $ECHO ""
+ $ECHO "Configuration summary:"
+ $ECHO "* Name: $CONF_NAME"
+ $ECHO "* Debug level: $DEBUG_LEVEL"
+ $ECHO "* HS debug level: $HOTSPOT_DEBUG_LEVEL"
+ $ECHO "* JVM variants: $JVM_VARIANTS"
+ $PRINTF "* JVM features: "
for variant in $JVM_VARIANTS; do
features_var_name=JVM_FEATURES_$variant
JVM_FEATURES_FOR_VARIANT=${!features_var_name}
- printf "$variant: \'$JVM_FEATURES_FOR_VARIANT\' "
+ $PRINTF "%s: \'%s\' " "$variant" "$JVM_FEATURES_FOR_VARIANT"
done
- printf "\n"
+ $ECHO ""
- printf "* OpenJDK target: OS: $OPENJDK_TARGET_OS, CPU architecture: $OPENJDK_TARGET_CPU_ARCH, address length: $OPENJDK_TARGET_CPU_BITS\n"
- printf "* Version string: $VERSION_STRING ($VERSION_SHORT)\n"
+ $ECHO "* OpenJDK target: OS: $OPENJDK_TARGET_OS, CPU architecture: $OPENJDK_TARGET_CPU_ARCH, address length: $OPENJDK_TARGET_CPU_BITS"
+ $ECHO "* Version string: $VERSION_STRING ($VERSION_SHORT)"
if test "x$SOURCE_DATE" != xupdated; then
source_date_info="$SOURCE_DATE ($SOURCE_DATE_ISO_8601)"
else
source_date_info="Determined at build time"
fi
- printf "* Source date: $source_date_info\n"
+ $ECHO "* Source date: $source_date_info"
- printf "\n"
- printf "Tools summary:\n"
+ $ECHO ""
+ $ECHO "Tools summary:"
if test "x$OPENJDK_BUILD_OS" = "xwindows"; then
- printf "* Environment: %s version %s; windows version %s; prefix \"%s\"; root \"%s\"\n" \
- "$WINENV_VENDOR" "$WINENV_VERSION" "$WINDOWS_VERSION" "$WINENV_PREFIX" "$WINENV_ROOT"
+ $ECHO "* Environment: $WINENV_VENDOR version $WINENV_VERSION; windows version $WINDOWS_VERSION; prefix \"$WINENV_PREFIX\"; root \"$WINENV_ROOT\""
fi
- printf "* Boot JDK: $BOOT_JDK_VERSION (at $BOOT_JDK)\n"
- printf "* Toolchain: $TOOLCHAIN_TYPE ($TOOLCHAIN_DESCRIPTION)\n"
+ $ECHO "* Boot JDK: $BOOT_JDK_VERSION (at $BOOT_JDK)"
+ $ECHO "* Toolchain: $TOOLCHAIN_TYPE ($TOOLCHAIN_DESCRIPTION)"
if test "x$DEVKIT_NAME" != x; then
- printf "* Devkit: $DEVKIT_NAME ($DEVKIT_ROOT)\n"
+ $ECHO "* Devkit: $DEVKIT_NAME ($DEVKIT_ROOT)"
elif test "x$DEVKIT_ROOT" != x; then
- printf "* Devkit: $DEVKIT_ROOT\n"
+ $ECHO "* Devkit: $DEVKIT_ROOT"
elif test "x$SYSROOT" != x; then
- printf "* Sysroot: $SYSROOT\n"
+ $ECHO "* Sysroot: $SYSROOT"
fi
- printf "* C Compiler: Version $CC_VERSION_NUMBER (at ${CC#"$FIXPATH "})\n"
- printf "* C++ Compiler: Version $CXX_VERSION_NUMBER (at ${CXX#"$FIXPATH "})\n"
+ $ECHO "* C Compiler: Version $CC_VERSION_NUMBER (at ${CC#"$FIXPATH "})"
+ $ECHO "* C++ Compiler: Version $CXX_VERSION_NUMBER (at ${CXX#"$FIXPATH "})"
- printf "\n"
- printf "Build performance summary:\n"
- printf "* Build jobs: $JOBS\n"
- printf "* Memory limit: $MEMORY_SIZE MB\n"
+ $ECHO ""
+ $ECHO "Build performance summary:"
+ $ECHO "* Build jobs: $JOBS"
+ $ECHO "* Memory limit: $MEMORY_SIZE MB"
if test "x$CCACHE_STATUS" != "x"; then
- printf "* ccache status: $CCACHE_STATUS\n"
+ $ECHO "* ccache status: $CCACHE_STATUS"
fi
- printf "\n"
+ $ECHO ""
if test "x$BUILDING_MULTIPLE_JVM_VARIANTS" = "xtrue"; then
- printf "NOTE: You have requested to build more than one version of the JVM, which\n"
- printf "will result in longer build times.\n"
- printf "\n"
+ $ECHO "NOTE: You have requested to build more than one version of the JVM, which"
+ $ECHO "will result in longer build times."
+ $ECHO ""
fi
if test "x$OUTPUT_DIR_IS_LOCAL" != "xyes"; then
- printf "WARNING: Your build output directory is not on a local disk.\n"
- printf "This will severely degrade build performance!\n"
- printf "It is recommended that you create an output directory on a local disk,\n"
- printf "and run the configure script again from that directory.\n"
- printf "\n"
+ $ECHO "WARNING: Your build output directory is not on a local disk."
+ $ECHO "This will severely degrade build performance!"
+ $ECHO "It is recommended that you create an output directory on a local disk,"
+ $ECHO "and run the configure script again from that directory."
+ $ECHO ""
fi
if test "x$IS_RECONFIGURE" = "xyes" && test "x$no_create" != "xyes"; then
- printf "WARNING: The result of this configuration has overridden an older\n"
- printf "configuration. You *should* run 'make clean' to make sure you get a\n"
- printf "proper build. Failure to do so might result in strange build problems.\n"
- printf "\n"
+ $ECHO "WARNING: The result of this configuration has overridden an older"
+ $ECHO "configuration. You *should* run 'make clean' to make sure you get a"
+ $ECHO "proper build. Failure to do so might result in strange build problems."
+ $ECHO ""
fi
if test "x$IS_RECONFIGURE" != "xyes" && test "x$no_create" = "xyes"; then
- printf "WARNING: The result of this configuration was not saved.\n"
- printf "You should run without '--no-create | -n' to create the configuration.\n"
- printf "\n"
+ $ECHO "WARNING: The result of this configuration was not saved."
+ $ECHO "You should run without '--no-create | -n' to create the configuration."
+ $ECHO ""
fi
if test "x$UNSUPPORTED_TOOLCHAIN_VERSION" = "xyes"; then
- printf "WARNING: The toolchain version used is known to have issues. Please\n"
- printf "consider using a supported version unless you know what you are doing.\n"
- printf "\n"
+ $ECHO "WARNING: The toolchain version used is known to have issues. Please"
+ $ECHO "consider using a supported version unless you know what you are doing."
+ $ECHO ""
fi
])
@@ -381,10 +383,10 @@ AC_DEFUN_ONCE([HELP_REPEAT_WARNINGS],
if test -e "$CONFIG_LOG_PATH/config.log"; then
$GREP '^configure:.*: WARNING:' "$CONFIG_LOG_PATH/config.log" > /dev/null 2>&1
if test $? -eq 0; then
- printf "The following warnings were produced. Repeated here for convenience:\n"
+ $ECHO "The following warnings were produced. Repeated here for convenience:"
# We must quote sed expression (using []) to stop m4 from eating the [].
$GREP '^configure:.*: WARNING:' "$CONFIG_LOG_PATH/config.log" | $SED -e [ 's/^configure:[0-9]*: //' ]
- printf "\n"
+ $ECHO ""
fi
fi
])
diff --git a/make/autoconf/jdk-options.m4 b/make/autoconf/jdk-options.m4
index 72e731e7ffc0c..289ed935fdfed 100644
--- a/make/autoconf/jdk-options.m4
+++ b/make/autoconf/jdk-options.m4
@@ -405,10 +405,19 @@ AC_DEFUN_ONCE([JDKOPT_SETUP_CODE_COVERAGE],
JCOV_FILTERS="$with_jcov_filters"
fi
fi
+
+ UTIL_ARG_WITH(NAME: jcov-modules, TYPE: string,
+ DEFAULT: [], RESULT: JCOV_MODULES_COMMMA_SEPARATED,
+ DESC: [which modules to include in jcov (comma-separated)],
+ OPTIONAL: true)
+
+ # Replace "," with " ".
+ JCOV_MODULES=${JCOV_MODULES_COMMMA_SEPARATED//,/ }
AC_SUBST(JCOV_ENABLED)
AC_SUBST(JCOV_HOME)
AC_SUBST(JCOV_INPUT_JDK)
AC_SUBST(JCOV_FILTERS)
+ AC_SUBST(JCOV_MODULES)
])
################################################################################
@@ -520,8 +529,21 @@ AC_DEFUN_ONCE([JDKOPT_SETUP_UNDEFINED_BEHAVIOR_SANITIZER],
# Silence them for now.
UBSAN_CHECKS="-fsanitize=undefined -fsanitize=float-divide-by-zero -fno-sanitize=shift-base -fno-sanitize=alignment \
$ADDITIONAL_UBSAN_CHECKS"
- UBSAN_CFLAGS="$UBSAN_CHECKS -Wno-stringop-truncation -Wno-format-overflow -Wno-array-bounds -Wno-stringop-overflow -fno-omit-frame-pointer -DUNDEFINED_BEHAVIOR_SANITIZER"
+ UBSAN_CFLAGS="$UBSAN_CHECKS -Wno-array-bounds -fno-omit-frame-pointer -DUNDEFINED_BEHAVIOR_SANITIZER"
+ if test "x$TOOLCHAIN_TYPE" = "xgcc"; then
+ UBSAN_CFLAGS="$UBSAN_CFLAGS -Wno-format-overflow -Wno-stringop-overflow -Wno-stringop-truncation"
+ fi
UBSAN_LDFLAGS="$UBSAN_CHECKS"
+ # On AIX, the llvm_symbolizer is not found out of the box, so we have to provide the
+ # fully qualified llvm_symbolizer path in the __ubsan_default_options() function in
+ # make/data/ubsan/ubsan_default_options.c. To get it there we compile our sources
+ # with an additional define LLVM_SYMBOLIZER, which we set here.
+ # To calculate the correct llvm_symbolizer path we can use the location of the compiler, because
+ # their relation is fixed.
+ if test "x$TOOLCHAIN_TYPE" = "xclang" && test "x$OPENJDK_TARGET_OS" = "xaix"; then
+ UBSAN_CFLAGS="$UBSAN_CFLAGS -fno-sanitize=function,vptr -DLLVM_SYMBOLIZER=$(dirname $(dirname $CC))/tools/ibm-llvm-symbolizer"
+ UBSAN_LDFLAGS="$UBSAN_LDFLAGS -fno-sanitize=function,vptr -Wl,-bbigtoc"
+ fi
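The dirname/dirname step relies only on the fixed layout of the IBM toolchain; a rough shell sketch with a made-up compiler path shows the value that ends up in -DLLVM_SYMBOLIZER:

    # Hypothetical AIX clang location; only the shape of the computation matters.
    CC=/opt/ibm/openxl/17.1.0/bin/ibm-clang_r
    echo "$(dirname $(dirname $CC))/tools/ibm-llvm-symbolizer"
    # -> /opt/ibm/openxl/17.1.0/tools/ibm-llvm-symbolizer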
UTIL_ARG_ENABLE(NAME: ubsan, DEFAULT: false, RESULT: UBSAN_ENABLED,
DESC: [enable UndefinedBehaviorSanitizer],
CHECK_AVAILABLE: [
@@ -988,6 +1010,18 @@ AC_DEFUN([JDKOPT_SETUP_SIGNING_HOOK],
AC_SUBST(SIGNING_HOOK)
])
+################################################################################
+#
+# Setup how javac should handle warnings.
+#
+AC_DEFUN([JDKOPT_SETUP_JAVA_WARNINGS],
+[
+ UTIL_ARG_ENABLE(NAME: java-warnings-as-errors, DEFAULT: true,
+ RESULT: JAVA_WARNINGS_AS_ERRORS,
+ DESC: [consider java warnings to be errors])
+ AC_SUBST(JAVA_WARNINGS_AS_ERRORS)
+])
+
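Assuming UTIL_ARG_ENABLE generates the usual enable/disable pair for the NAME given above, a build that wants javac warnings to stay non-fatal would be configured roughly like this (a sketch, not a documented invocation):

    bash configure --disable-java-warnings-as-errors
    make images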
################################################################################
#
# fallback linker
diff --git a/make/autoconf/lib-bundled.m4 b/make/autoconf/lib-bundled.m4
index 091f01cadb5d7..3246697663cef 100644
--- a/make/autoconf/lib-bundled.m4
+++ b/make/autoconf/lib-bundled.m4
@@ -62,19 +62,29 @@ AC_DEFUN_ONCE([LIB_SETUP_LIBJPEG],
if test "x${with_libjpeg}" = "xbundled"; then
USE_EXTERNAL_LIBJPEG=false
+ LIBJPEG_CFLAGS=""
+ LIBJPEG_LIBS=""
elif test "x${with_libjpeg}" = "xsystem"; then
- AC_CHECK_HEADER(jpeglib.h, [],
- [ AC_MSG_ERROR([--with-libjpeg=system specified, but jpeglib.h not found!])])
- AC_CHECK_LIB(jpeg, jpeg_CreateDecompress, [],
- [ AC_MSG_ERROR([--with-libjpeg=system specified, but no libjpeg found])])
-
- USE_EXTERNAL_LIBJPEG=true
- LIBJPEG_LIBS="-ljpeg"
+ PKG_CHECK_MODULES(LIBJPEG, libjpeg, [LIBJPEG_FOUND=yes], [LIBJPEG_FOUND=no])
+ if test "x${LIBJPEG_FOUND}" = "xyes"; then
+ # PKG_CHECK_MODULES will set LIBJPEG_CFLAGS and LIBJPEG_LIBS
+ USE_EXTERNAL_LIBJPEG=true
+ else
+ AC_CHECK_HEADER(jpeglib.h, [],
+ [ AC_MSG_ERROR([--with-libjpeg=system specified, but jpeglib.h not found!])])
+ AC_CHECK_LIB(jpeg, jpeg_CreateDecompress, [],
+ [ AC_MSG_ERROR([--with-libjpeg=system specified, but no libjpeg found])])
+
+ USE_EXTERNAL_LIBJPEG=true
+ LIBJPEG_CFLAGS=""
+ LIBJPEG_LIBS="-ljpeg"
+ fi
else
AC_MSG_ERROR([Invalid use of --with-libjpeg: ${with_libjpeg}, use 'system' or 'bundled'])
fi
AC_SUBST(USE_EXTERNAL_LIBJPEG)
+ AC_SUBST(LIBJPEG_CFLAGS)
AC_SUBST(LIBJPEG_LIBS)
])
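For readers unfamiliar with PKG_CHECK_MODULES: on success it fills LIBJPEG_CFLAGS and LIBJPEG_LIBS from pkg-config. A rough command-line equivalent (the printed flags are system-dependent examples):

    pkg-config --exists libjpeg && echo "libjpeg found"
    pkg-config --cflags libjpeg    # often empty or an -I flag
    pkg-config --libs libjpeg      # typically -ljpeg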
@@ -85,6 +95,10 @@ AC_DEFUN_ONCE([LIB_SETUP_GIFLIB],
[
AC_ARG_WITH(giflib, [AS_HELP_STRING([--with-giflib],
[use giflib from build system or OpenJDK source (system, bundled) @<:@bundled@:>@])])
+ AC_ARG_WITH(giflib-include, [AS_HELP_STRING([--with-giflib-include],
+ [specify directory for the system giflib include files])])
+ AC_ARG_WITH(giflib-lib, [AS_HELP_STRING([--with-giflib-lib],
+ [specify directory for the system giflib library])])
AC_MSG_CHECKING([for which giflib to use])
# default is bundled
@@ -97,11 +111,40 @@ AC_DEFUN_ONCE([LIB_SETUP_GIFLIB],
if test "x${with_giflib}" = "xbundled"; then
USE_EXTERNAL_LIBGIF=false
+ GIFLIB_CFLAGS=""
+ GIFLIB_LIBS=""
elif test "x${with_giflib}" = "xsystem"; then
- AC_CHECK_HEADER(gif_lib.h, [],
- [ AC_MSG_ERROR([--with-giflib=system specified, but gif_lib.h not found!])])
- AC_CHECK_LIB(gif, DGifGetCode, [],
- [ AC_MSG_ERROR([--with-giflib=system specified, but no giflib found!])])
+ GIFLIB_H_FOUND=no
+ if test "x${with_giflib_include}" != x; then
+ GIFLIB_CFLAGS="-I${with_giflib_include}"
+ GIFLIB_H_FOUND=yes
+ fi
+ if test "x$GIFLIB_H_FOUND" = xno; then
+ AC_CHECK_HEADER(gif_lib.h,
+ [
+ GIFLIB_CFLAGS=""
+ GIFLIB_H_FOUND=yes
+ ])
+ fi
+ if test "x$GIFLIB_H_FOUND" = xno; then
+ AC_MSG_ERROR([--with-giflib=system specified, but gif_lib.h not found!])
+ fi
+
+ GIFLIB_LIB_FOUND=no
+ if test "x${with_giflib_lib}" != x; then
+ GIFLIB_LIBS="-L${with_giflib_lib} -lgif"
+ GIFLIB_LIB_FOUND=yes
+ fi
+ if test "x$GIFLIB_LIB_FOUND" = xno; then
+ AC_CHECK_LIB(gif, DGifGetCode,
+ [
+ GIFLIB_LIBS="-lgif"
+ GIFLIB_LIB_FOUND=yes
+ ])
+ fi
+ if test "x$GIFLIB_LIB_FOUND" = xno; then
+ AC_MSG_ERROR([--with-giflib=system specified, but no giflib found!])
+ fi
USE_EXTERNAL_LIBGIF=true
GIFLIB_LIBS=-lgif
@@ -110,6 +153,7 @@ AC_DEFUN_ONCE([LIB_SETUP_GIFLIB],
fi
AC_SUBST(USE_EXTERNAL_LIBGIF)
+ AC_SUBST(GIFLIB_CFLAGS)
AC_SUBST(GIFLIB_LIBS)
])
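A hypothetical configure invocation using the two new options, for a giflib installed outside the default search paths (the paths are made up):

    bash configure --with-giflib=system \
        --with-giflib-include=/opt/giflib/include \
        --with-giflib-lib=/opt/giflib/lib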
diff --git a/make/autoconf/libraries.m4 b/make/autoconf/libraries.m4
index b946be97d96ac..bf697928f1bb8 100644
--- a/make/autoconf/libraries.m4
+++ b/make/autoconf/libraries.m4
@@ -1,5 +1,5 @@
#
-# Copyright (c) 2011, 2024, Oracle and/or its affiliates. All rights reserved.
+# Copyright (c) 2011, 2025, Oracle and/or its affiliates. All rights reserved.
# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
#
# This code is free software; you can redistribute it and/or modify it
@@ -98,13 +98,7 @@ AC_DEFUN([LIB_SETUP_JVM_LIBS],
# 32-bit platforms needs fallback library for 8-byte atomic ops on Zero
if HOTSPOT_CHECK_JVM_VARIANT(zero); then
if test "x$OPENJDK_$1_OS" = xlinux &&
- (test "x$OPENJDK_$1_CPU" = xarm ||
- test "x$OPENJDK_$1_CPU" = xm68k ||
- test "x$OPENJDK_$1_CPU" = xmips ||
- test "x$OPENJDK_$1_CPU" = xmipsel ||
- test "x$OPENJDK_$1_CPU" = xppc ||
- test "x$OPENJDK_$1_CPU" = xsh ||
- test "x$OPENJDK_$1_CPU" = xriscv32); then
+ test "x$OPENJDK_TARGET_CPU_BITS" = "x32"; then
BASIC_JVM_LIBS_$1="$BASIC_JVM_LIBS_$1 -latomic"
fi
fi
diff --git a/make/autoconf/spec.gmk.template b/make/autoconf/spec.gmk.template
index 80c6dfc2ba223..e720916d88a43 100644
--- a/make/autoconf/spec.gmk.template
+++ b/make/autoconf/spec.gmk.template
@@ -1,5 +1,5 @@
#
-# Copyright (c) 2011, 2024, Oracle and/or its affiliates. All rights reserved.
+# Copyright (c) 2011, 2025, Oracle and/or its affiliates. All rights reserved.
# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
#
# This code is free software; you can redistribute it and/or modify it
@@ -454,6 +454,7 @@ JCOV_ENABLED := @JCOV_ENABLED@
JCOV_HOME := @JCOV_HOME@
JCOV_INPUT_JDK := @JCOV_INPUT_JDK@
JCOV_FILTERS := @JCOV_FILTERS@
+JCOV_MODULES := @JCOV_MODULES@
# AddressSanitizer
ASAN_ENABLED := @ASAN_ENABLED@
@@ -517,6 +518,7 @@ DISABLED_WARNINGS_CXX := @DISABLED_WARNINGS_CXX@
# A global flag (true or false) determining if native warnings are considered errors.
WARNINGS_AS_ERRORS := @WARNINGS_AS_ERRORS@
+JAVA_WARNINGS_AS_ERRORS := @JAVA_WARNINGS_AS_ERRORS@
CFLAGS_CCACHE := @CFLAGS_CCACHE@
ADLC_LANGSTD_CXXFLAGS := @ADLC_LANGSTD_CXXFLAGS@
@@ -800,8 +802,10 @@ TAR_SUPPORTS_TRANSFORM := @TAR_SUPPORTS_TRANSFORM@
# Build setup
USE_EXTERNAL_LIBJPEG := @USE_EXTERNAL_LIBJPEG@
+LIBJPEG_CFLAGS := @LIBJPEG_CFLAGS@
LIBJPEG_LIBS := @LIBJPEG_LIBS@
USE_EXTERNAL_LIBGIF := @USE_EXTERNAL_LIBGIF@
+GIFLIB_CFLAGS := @GIFLIB_CFLAGS@
GIFLIB_LIBS := @GIFLIB_LIBS@
USE_EXTERNAL_LIBZ := @USE_EXTERNAL_LIBZ@
LIBZ_CFLAGS := @LIBZ_CFLAGS@
@@ -843,10 +847,12 @@ SVE_CFLAGS := @SVE_CFLAGS@
JDK_IMAGE_SUBDIR := jdk
JRE_IMAGE_SUBDIR := jre
JCOV_IMAGE_SUBDIR := jdk-jcov
+STATIC_JDK_IMAGE_SUBDIR := static-jdk
# Colon left out to be able to override output dir for bootcycle-images
JDK_IMAGE_DIR = $(IMAGES_OUTPUTDIR)/$(JDK_IMAGE_SUBDIR)
JRE_IMAGE_DIR = $(IMAGES_OUTPUTDIR)/$(JRE_IMAGE_SUBDIR)
+STATIC_JDK_IMAGE_DIR = $(IMAGES_OUTPUTDIR)/$(STATIC_JDK_IMAGE_SUBDIR)
JCOV_IMAGE_DIR = $(IMAGES_OUTPUTDIR)/$(JCOV_IMAGE_SUBDIR)
# Test image, as above
@@ -926,6 +932,7 @@ DOCS_JAVASE_BUNDLE_NAME := javase-$(BASE_NAME)_doc-api-spec$(DEBUG_PART).tar.gz
DOCS_REFERENCE_BUNDLE_NAME := jdk-reference-$(BASE_NAME)_doc-api-spec$(DEBUG_PART).tar.gz
STATIC_LIBS_BUNDLE_NAME := jdk-$(BASE_NAME)_bin-static-libs$(DEBUG_PART).tar.gz
STATIC_LIBS_GRAAL_BUNDLE_NAME := jdk-$(BASE_NAME)_bin-static-libs-graal$(DEBUG_PART).tar.gz
+STATIC_JDK_BUNDLE_NAME := static-jdk-$(BASE_NAME)_bin$(DEBUG_PART).$(JDK_BUNDLE_EXTENSION)
JCOV_BUNDLE_NAME := jdk-jcov-$(BASE_NAME)_bin$(DEBUG_PART).$(JDK_BUNDLE_EXTENSION)
JDK_BUNDLE := $(BUNDLES_OUTPUTDIR)/$(JDK_BUNDLE_NAME)
@@ -936,6 +943,7 @@ TEST_BUNDLE := $(BUNDLES_OUTPUTDIR)/$(TEST_BUNDLE_NAME)
DOCS_JDK_BUNDLE := $(BUNDLES_OUTPUTDIR)/$(DOCS_JDK_BUNDLE_NAME)
DOCS_JAVASE_BUNDLE := $(BUNDLES_OUTPUTDIR)/$(DOCS_JAVASE_BUNDLE_NAME)
DOCS_REFERENCE_BUNDLE := $(BUNDLES_OUTPUTDIR)/$(DOCS_REFERENCE_BUNDLE_NAME)
+STATIC_JDK_BUNDLE := $(BUNDLES_OUTPUTDIR)/$(STATIC_JDK_BUNDLE_NAME)
JCOV_BUNDLE := $(BUNDLES_OUTPUTDIR)/$(JCOV_BUNDLE_NAME)
# This macro is called to allow inclusion of closed source counterparts.
diff --git a/make/autoconf/toolchain.m4 b/make/autoconf/toolchain.m4
index 9013f9cf9221f..b7a010746862e 100644
--- a/make/autoconf/toolchain.m4
+++ b/make/autoconf/toolchain.m4
@@ -291,6 +291,11 @@ AC_DEFUN_ONCE([TOOLCHAIN_PRE_DETECTION],
# For Xcode, we set the Xcode version as TOOLCHAIN_VERSION
TOOLCHAIN_VERSION=`$ECHO $XCODE_VERSION_OUTPUT | $CUT -f 2 -d ' '`
TOOLCHAIN_DESCRIPTION="$TOOLCHAIN_DESCRIPTION from Xcode $TOOLCHAIN_VERSION"
+ if test "x$TOOLCHAIN_VERSION" = "x16" || test "x$TOOLCHAIN_VERSION" = "x16.1" ; then
+ AC_MSG_NOTICE([Xcode $TOOLCHAIN_VERSION has a compiler bug that causes the build to fail.])
+ AC_MSG_NOTICE([Please use Xcode 16.2 or later, or a version prior to 16.])
+ AC_MSG_ERROR([Compiler version is not supported.])
+ fi
fi
fi
AC_SUBST(TOOLCHAIN_VERSION)
diff --git a/make/autoconf/toolchain_microsoft.m4 b/make/autoconf/toolchain_microsoft.m4
index 4f970df7b5f7c..17ad2666b3ab8 100644
--- a/make/autoconf/toolchain_microsoft.m4
+++ b/make/autoconf/toolchain_microsoft.m4
@@ -1,5 +1,5 @@
#
-# Copyright (c) 2011, 2024, Oracle and/or its affiliates. All rights reserved.
+# Copyright (c) 2011, 2025, Oracle and/or its affiliates. All rights reserved.
# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
#
# This code is free software; you can redistribute it and/or modify it
@@ -87,7 +87,7 @@ AC_DEFUN([TOOLCHAIN_CHECK_POSSIBLE_VISUAL_STUDIO_ROOT],
elif test "x$TARGET_CPU" = xaarch64; then
# for host x86-64, target aarch64
# aarch64 requires Visual Studio 16.8 or higher
- VCVARSFILES="vcvarsamd64_arm64.bat vcvarsx86_arm64.bat"
+ VCVARSFILES="vcvarsarm64.bat vcvarsamd64_arm64.bat vcvarsx86_arm64.bat"
fi
for VCVARSFILE in $VCVARSFILES; do
diff --git a/make/autoconf/util_paths.m4 b/make/autoconf/util_paths.m4
index 9e3e5472c9e49..40864680aadd8 100644
--- a/make/autoconf/util_paths.m4
+++ b/make/autoconf/util_paths.m4
@@ -1,5 +1,5 @@
#
-# Copyright (c) 2011, 2024, Oracle and/or its affiliates. All rights reserved.
+# Copyright (c) 2011, 2025, Oracle and/or its affiliates. All rights reserved.
# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
#
# This code is free software; you can redistribute it and/or modify it
@@ -58,21 +58,32 @@ AC_DEFUN([UTIL_PREPEND_TO_PATH],
# 2) The path will be absolute, and it will be in unix-style (on
# cygwin).
# $1: The name of the variable to fix
-# $2: if NOFAIL, errors will be silently ignored
+# $2: if NOFAIL and the path cannot be resolved, errors will not be
+# reported and an empty path will be set instead
AC_DEFUN([UTIL_FIXUP_PATH],
[
# Only process if variable expands to non-empty
path="[$]$1"
if test "x$path" != x; then
if test "x$OPENJDK_BUILD_OS" = "xwindows"; then
- if test "x$2" = "xNOFAIL"; then
- quiet_option="-q"
+ imported_path=`$FIXPATH_BASE -q import "$path"`
+ if test $? -ne 0 || test ! -e $imported_path; then
+ if test "x$2" != "xNOFAIL"; then
+ AC_MSG_NOTICE([The path of $1, which is given as "$path", can not be properly resolved.])
+ AC_MSG_NOTICE([Please see the section "Special Considerations" in building.md.])
+ AC_MSG_NOTICE([This is the error message given by fixpath:])
+ # Rerun fixpath without -q to get an error message
+ $FIXPATH_BASE import "$path"
+ AC_MSG_ERROR([Cannot continue])
+ else
+ imported_path=""
+ fi
fi
- imported_path=`$FIXPATH_BASE $quiet_option import "$path"`
- $FIXPATH_BASE verify "$imported_path"
+
+ $FIXPATH_BASE -q verify "$imported_path"
if test $? -ne 0; then
if test "x$2" != "xNOFAIL"; then
- AC_MSG_ERROR([The path of $1, which resolves as "$path", could not be imported.])
+ AC_MSG_ERROR([The path of $1, which resolves as "$path", could not be verified.])
else
imported_path=""
fi
@@ -83,7 +94,7 @@ AC_DEFUN([UTIL_FIXUP_PATH],
if test "x$imported_path_lower" != "x$orig_path_lower"; then
$1="$imported_path"
fi
- else
+ else # non-Windows
[ if [[ "$path" =~ " " ]]; then ]
if test "x$2" != "xNOFAIL"; then
AC_MSG_NOTICE([The path of $1, which resolves as "$path", is invalid.])
@@ -186,7 +197,6 @@ AC_DEFUN([UTIL_CHECK_WINENV_EXEC_TYPE],
# it need to be in the PATH.
# $1: The name of the variable to fix
# $2: Where to look for the command (replaces $PATH)
-# $3: set to NOFIXPATH to skip prefixing FIXPATH, even if needed on platform
AC_DEFUN([UTIL_FIXUP_EXECUTABLE],
[
input="[$]$1"
@@ -233,15 +243,19 @@ AC_DEFUN([UTIL_FIXUP_EXECUTABLE],
# This is a path with slashes, don't look at $PATH
if test "x$OPENJDK_BUILD_OS" = "xwindows"; then
# fixpath.sh import will do all heavy lifting for us
- new_path=`$FIXPATH_BASE import "$path"`
+ new_path=`$FIXPATH_BASE -q import "$path"`
- if test ! -e $new_path; then
+ if test $? -ne 0 || test ! -e $new_path; then
# It failed, but maybe spaces were part of the path and not separating
# the command and argument. Retry using that assumption.
- new_path=`$FIXPATH_BASE import "$input"`
- if test ! -e $new_path; then
- AC_MSG_NOTICE([The command for $1, which resolves as "$input", can not be found.])
- AC_MSG_ERROR([Cannot locate $input])
+ new_path=`$FIXPATH_BASE -q import "$input"`
+ if test $? -ne 0 || test ! -e $new_path; then
+ AC_MSG_NOTICE([The command for $1, which is given as "$input", can not be properly resolved.])
+ AC_MSG_NOTICE([Please see the section "Special Considerations" in building.md.])
+ AC_MSG_NOTICE([This is the error message given by fixpath:])
+ # Rerun fixpath without -q to get an error message
+ $FIXPATH_BASE import "$input"
+ AC_MSG_ERROR([Cannot continue])
fi
# It worked, clear all "arguments"
arguments=""
@@ -282,10 +296,6 @@ AC_DEFUN([UTIL_FIXUP_EXECUTABLE],
fi
fi
- if test "x$3" = xNOFIXPATH; then
- fixpath_prefix=""
- fi
-
# Now join together the path and the arguments once again
new_complete="$fixpath_prefix$new_path$arguments"
$1="$new_complete"
@@ -353,7 +363,15 @@ AC_DEFUN([UTIL_SETUP_TOOL],
else
# Otherwise we believe it is a complete path. Use it as it is.
if test ! -x "$tool_command" && test ! -x "${tool_command}.exe"; then
- AC_MSG_ERROR([User supplied tool $1="$tool_command" does not exist or is not executable])
+ # Maybe the path had spaces in it; try again with the entire argument
+ if test ! -x "$tool_override" && test ! -x "${tool_override}.exe"; then
+ AC_MSG_ERROR([User supplied tool $1="$tool_override" does not exist or is not executable])
+ else
+ # We successfully located the executable assuming the spaces were part of the path.
+ # We can't combine using paths with spaces and arguments, so assume tool_args is empty.
+ tool_command="$tool_override"
+ tool_args=""
+ fi
fi
if test ! -x "$tool_command"; then
tool_command="${tool_command}.exe"
@@ -379,7 +397,6 @@ AC_DEFUN([UTIL_SETUP_TOOL],
# $1: variable to set
# $2: executable name (or list of names) to look for
# $3: [path]
-# $4: set to NOFIXPATH to skip prefixing FIXPATH, even if needed on platform
AC_DEFUN([UTIL_LOOKUP_PROGS],
[
UTIL_SETUP_TOOL($1, [
@@ -421,10 +438,8 @@ AC_DEFUN([UTIL_LOOKUP_PROGS],
# If we have FIXPATH enabled, strip all instances of it and prepend
# a single one, to avoid double fixpath prefixing.
- if test "x$4" != xNOFIXPATH; then
- [ if [[ $FIXPATH != "" && $result =~ ^"$FIXPATH " ]]; then ]
- result="\$FIXPATH ${result#"$FIXPATH "}"
- fi
+ [ if [[ $FIXPATH != "" && $result =~ ^"$FIXPATH " ]]; then ]
+ result="\$FIXPATH ${result#"$FIXPATH "}"
fi
AC_MSG_RESULT([$result])
break 2;
@@ -515,6 +530,24 @@ AC_DEFUN([UTIL_ADD_FIXPATH],
fi
])
+################################################################################
+# Return a path to the executable binary from a command line, stripping away
+# any FIXPATH prefix or arguments. The resulting value can be checked for
+# existence using "test -e". The result is returned in a variable named
+# "$1_EXECUTABLE".
+#
+# $1: variable describing the command to get the binary for
+AC_DEFUN([UTIL_GET_EXECUTABLE],
+[
+ # Strip the FIXPATH prefix, if any
+ fixpath_stripped="[$]$1"
+ [ if [[ $FIXPATH != "" && $fixpath_stripped =~ ^"$FIXPATH " ]]; then ]
+ fixpath_stripped="${fixpath_stripped#"$FIXPATH "}"
+ fi
+ # Remove any arguments following the binary
+ $1_EXECUTABLE="${fixpath_stripped%% *}"
+])
+
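UTIL_GET_EXECUTABLE above is ordinary shell parameter expansion; a standalone sketch with made-up values shows the two stripping steps and the resulting *_EXECUTABLE variable:

    FIXPATH="/build/fixpath exec"                       # hypothetical
    CC="$FIXPATH /cygdrive/c/vs/bin/cl.exe -nologo"     # hypothetical command line
    stripped="${CC#"$FIXPATH "}"                        # drop the FIXPATH prefix
    CC_EXECUTABLE="${stripped%% *}"                     # drop trailing arguments
    test -e "$CC_EXECUTABLE" || echo "cl.exe not found" # usable with 'test -e'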
################################################################################
AC_DEFUN([UTIL_REMOVE_SYMBOLIC_LINKS],
[
diff --git a/make/common/FindTests.gmk b/make/common/FindTests.gmk
index 1f3a70b30356a..517bb2973f4e6 100644
--- a/make/common/FindTests.gmk
+++ b/make/common/FindTests.gmk
@@ -58,13 +58,15 @@ ifeq ($(GENERATE_FIND_TESTS_FILE), true)
$(TOPDIR)/test/make/TestMake.gmk
$(call MakeTargetDir)
( $(foreach root, $(JTREG_TESTROOTS), \
- $(PRINTF) "\n$(root)_JTREG_TEST_GROUPS := " ; \
+ $(ECHO) ""; \
+ $(PRINTF) "\n%s_JTREG_TEST_GROUPS := " "$(root)"; \
$(SED) -n -e 's/^\#.*//g' -e 's/\([^ ]*\)\w*=.*/\1/gp' \
$($(root)_JTREG_GROUP_FILES) \
| $(SORT) -u | $(TR) '\n' ' ' ; \
) \
) > $@
- $(PRINTF) "\nMAKE_TEST_TARGETS := " >> $@
+ $(ECHO) "" >> $@
+ $(PRINTF) "MAKE_TEST_TARGETS := " >> $@
$(MAKE) -s --no-print-directory $(MAKE_ARGS) \
SPEC=$(SPEC) -f $(TOPDIR)/test/make/TestMake.gmk print-targets \
TARGETS_FILE=$@
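The printf rewrites here (and the matching ones in Modules.gmk and GensrcCommon.gmk) follow the usual rule of passing data as printf arguments instead of embedding it in the format string; a minimal shell illustration with a hypothetical value:

    root='test/hotspot/jtreg'
    # Safe even if $root contained '%' or '\': the format string stays fixed.
    printf '\n%s_JTREG_TEST_GROUPS := ' "$root"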
diff --git a/make/common/JarArchive.gmk b/make/common/JarArchive.gmk
index 1f8ed1bc002ef..26a98f289498b 100644
--- a/make/common/JarArchive.gmk
+++ b/make/common/JarArchive.gmk
@@ -256,7 +256,7 @@ define SetupJarArchiveBody
$$(if $$($1_JARMAIN), \
$(ECHO) "Main-Class: $$(strip $$($1_JARMAIN))" >> $$($1_MANIFEST_FILE) $$(NEWLINE)) \
$$(if $$($1_EXTRA_MANIFEST_ATTR), \
- $(PRINTF) "$$($1_EXTRA_MANIFEST_ATTR)\n" >> $$($1_MANIFEST_FILE) $$(NEWLINE)) \
+ $(ECHO) "$$($1_EXTRA_MANIFEST_ATTR)" >> $$($1_MANIFEST_FILE) $$(NEWLINE)) \
$(ECHO) Creating $$($1_NAME) $$(NEWLINE) \
$$($1_JAR_CMD) --create $$($1_JAR_OPTIONS) --file $$@ --manifest $$($1_MANIFEST_FILE) $$(NEWLINE) \
$$($1_SCAPTURE_CONTENTS) \
diff --git a/make/common/JavaCompilation.gmk b/make/common/JavaCompilation.gmk
index f48aefcd51700..c5a74413de19e 100644
--- a/make/common/JavaCompilation.gmk
+++ b/make/common/JavaCompilation.gmk
@@ -80,15 +80,13 @@ endef
#
# The sed expression does this:
# 1. Add a backslash before any :, = or ! that do not have a backslash already.
-# 2. Apply the file unicode2x.sed which does a whole bunch of \u00XX to \xXX
-# conversions.
-# 3. Delete all lines starting with #.
-# 4. Delete empty lines.
-# 5. Append lines ending with \ with the next line.
-# 6. Remove leading and trailing white space. Note that tabs must be explicit
+# 2. Delete all lines starting with #.
+# 3. Delete empty lines.
+# 4. Append lines ending with \ with the next line.
+# 5. Remove leading and trailing white space. Note that tabs must be explicit
# as sed on macosx does not understand '\t'.
-# 7. Replace the first \= with just =.
-# 8. Finally it's all sorted to create a stable output.
+# 6. Replace the first \= with just =.
+# 7. Finally it's all sorted to create a stable output.
#
# It is assumed that = is the character used for separating names and values.
define add_file_to_clean
@@ -108,7 +106,6 @@ define add_file_to_clean
( $(CAT) $$< && $(ECHO) "" ) \
| $(SED) -e 's/\([^\\]\):/\1\\:/g' -e 's/\([^\\]\)=/\1\\=/g' \
-e 's/\([^\\]\)!/\1\\!/g' -e 's/^[ ]*#.*/#/g' \
- | $(SED) -f "$$(TOPDIR)/make/common/support/unicode2x.sed" \
| $(SED) -e '/^#/d' -e '/^$$$$/d' \
-e :a -e '/\\$$$$/N; s/\\\n//; ta' \
-e 's/^[ ]*//;s/[ ]*$$$$//' \
@@ -155,6 +152,7 @@ endef
# INCLUDE_FILES "com/sun/SolarisFoobar.java" means only compile this file!
# EXCLUDE_FILES "com/sun/SolarisFoobar.java" means do not compile this particular file!
# "SolarisFoobar.java" means do not compile SolarisFoobar, wherever it is found.
+# EXCLUDE_PATTERNS Exclude files matching any of these substrings
# EXTRA_FILES List of extra source files to include in compilation. Can be used to
# specify files that need to be generated by other rules first.
# HEADERS path to directory where all generated c-headers are written.
@@ -264,14 +262,17 @@ define SetupJavaCompilationBody
$$(error Invalid value for COMPILER in SetupJavaCompilation for $1: '$$($1_COMPILER)')
endif
- # Allow overriding on the command line
- JAVA_WARNINGS_ARE_ERRORS ?= -Werror
-
# Tell javac to do exactly as told and no more
- PARANOIA_FLAGS := -implicit:none -Xprefer:source -XDignore.symbol.file=true -encoding ascii
+ PARANOIA_FLAGS := -implicit:none -Xprefer:source -XDignore.symbol.file=true
- $1_FLAGS += -g -Xlint:all $$($1_TARGET_RELEASE) $$(PARANOIA_FLAGS) $$(JAVA_WARNINGS_ARE_ERRORS)
+ $1_FLAGS += -g -Xlint:all $$($1_TARGET_RELEASE) $$(PARANOIA_FLAGS)
$1_FLAGS += $$($1_JAVAC_FLAGS)
+ # Set character encoding in source
+ $1_FLAGS += -encoding utf-8
+
+ ifeq ($$(JAVA_WARNINGS_AS_ERRORS), true)
+ $1_FLAGS += -Werror
+ endif
ifneq ($$($1_DISABLED_WARNINGS), )
$1_FLAGS += -Xlint:$$(call CommaList, $$(addprefix -, $$($1_DISABLED_WARNINGS)))
@@ -332,6 +333,20 @@ define SetupJavaCompilationBody
$1_INCLUDE_PATTERN += $$(foreach i, $$($1_SRC), $$(addprefix $$i/, $$(addsuffix /%, $$($1_INCLUDES))))
endif
+ ifneq ($$($1_EXCLUDE_PATTERNS), )
+ # We must not match the exclude pattern against the src roots, so first
+ # strip the src prefixes from the absolute file paths in SRCS.
+ $1_SRCS_WITHOUT_ROOTS := $$(foreach i, $$($1_SRC), \
+ $$(patsubst $$i/%,%, $$(filter $$i/%, $$($1_SRCS))))
+ $1_EXCLUDE_PATTERNS_WITHOUT_ROOTS := $$(call containing, \
+ $$($1_EXCLUDE_PATTERNS), $$($1_SRCS_WITHOUT_ROOTS))
+ # Then add back all possible src prefixes; this will generate more paths
+ # than really exist, but that does not matter since we will use this as
+ # input to filter-out.
+ $1_EXCLUDE_PATTERN += $$(foreach i, $$($1_SRC), $$(addprefix $$i/, \
+ $$($1_EXCLUDE_PATTERNS_WITHOUT_ROOTS)))
+ endif
+
# Apply include/exclude patterns to java sources
ifneq ($$($1_EXCLUDE_PATTERN), )
$1_SRCS := $$(filter-out $$($1_EXCLUDE_PATTERN), $$($1_SRCS))
diff --git a/make/common/JdkNativeCompilation.gmk b/make/common/JdkNativeCompilation.gmk
index 372ad39305c59..0285669ffd8ff 100644
--- a/make/common/JdkNativeCompilation.gmk
+++ b/make/common/JdkNativeCompilation.gmk
@@ -227,6 +227,8 @@ endef
GLOBAL_VERSION_INFO_RESOURCE := $(TOPDIR)/src/java.base/windows/native/common/version.rc
+# \xA9 is the copyright symbol in ANSI encoding (Windows-1252), which rc.exe
+# assumes the resource file is in.
JDK_RCFLAGS=$(RCFLAGS) \
-D"JDK_VERSION_STRING=$(VERSION_STRING)" \
-D"JDK_COMPANY=$(JDK_RC_COMPANY_NAME)" \
diff --git a/make/common/MakeFileStart.gmk b/make/common/MakeFileStart.gmk
index f1dd0abb792c3..f18c623d3e8d2 100644
--- a/make/common/MakeFileStart.gmk
+++ b/make/common/MakeFileStart.gmk
@@ -47,7 +47,7 @@ endif
# We need spec.gmk to get $(TOPDIR)
include $(SPEC)
-THIS_MAKEFILE := $(patsubst make/%,%,$(patsubst $(TOPDIR)/%,%,$(THIS_MAKEFILE_PATH)))
+THIS_MAKEFILE := $(patsubst make/%,%,$(patsubst $(TOPDIR_ALT)/make/%,%,$(patsubst $(TOPDIR)/%,%,$(THIS_MAKEFILE_PATH))))
ifeq ($(LOG_FLOW), true)
$(info :Enter $(THIS_MAKEFILE))
diff --git a/make/common/MakeIncludeStart.gmk b/make/common/MakeIncludeStart.gmk
index d09f027c1d38d..3904633f9f218 100644
--- a/make/common/MakeIncludeStart.gmk
+++ b/make/common/MakeIncludeStart.gmk
@@ -29,7 +29,7 @@
# Get the next to last word (by prepending a padding element)
THIS_INCLUDE_PATH := $(word $(words ${MAKEFILE_LIST}),padding ${MAKEFILE_LIST})
-THIS_INCLUDE := $(patsubst $(TOPDIR)/make/%,%,$(THIS_INCLUDE_PATH))
+THIS_INCLUDE := $(patsubst $(TOPDIR_ALT)/make/%,%,$(patsubst $(TOPDIR)/make/%,%,$(THIS_INCLUDE_PATH)))
# Print an indented message, also counting the top-level makefile as a level
ifneq ($(INCLUDE_GUARD_$(THIS_INCLUDE)), true)
diff --git a/make/common/Modules.gmk b/make/common/Modules.gmk
index f4f815c740db6..725424d7618da 100644
--- a/make/common/Modules.gmk
+++ b/make/common/Modules.gmk
@@ -180,7 +180,7 @@ ifeq ($(GENERATE_MODULE_DEPS_FILE), true)
$(call MakeTargetDir)
$(RM) $@
$(foreach m, $(MODULE_INFOS), \
- ( $(PRINTF) "DEPS_$(call GetModuleNameFromModuleInfo, $m) := " && \
+ ( $(PRINTF) "DEPS_%s := " "$(call GetModuleNameFromModuleInfo, $m)" && \
$(AWK) -v MODULE=$(call GetModuleNameFromModuleInfo, $m) ' \
BEGIN { if (MODULE != "java.base") printf(" java.base"); } \
/^ *requires/ { sub(/;/, ""); \
@@ -194,7 +194,7 @@ ifeq ($(GENERATE_MODULE_DEPS_FILE), true)
gsub(/\r/, ""); \
printf(" %s", $$0) } \
END { printf("\n") }' $m && \
- $(PRINTF) "TRANSITIVE_MODULES_$(call GetModuleNameFromModuleInfo, $m) := " && \
+ $(PRINTF) "TRANSITIVE_MODULES_%s := " "$(call GetModuleNameFromModuleInfo, $m)" && \
$(AWK) -v MODULE=$(call GetModuleNameFromModuleInfo, $m) ' \
BEGIN { if (MODULE != "java.base") printf(" java.base"); } \
/^ *requires *transitive/ { \
diff --git a/make/common/modules/GensrcCommon.gmk b/make/common/modules/GensrcCommon.gmk
index 64d1f71d82e65..2a94c3f9a4200 100644
--- a/make/common/modules/GensrcCommon.gmk
+++ b/make/common/modules/GensrcCommon.gmk
@@ -41,8 +41,8 @@ include $(TOPDIR)/make/ToolsJdk.gmk
define SetupVersionProperties
$(SUPPORT_OUTPUTDIR)/gensrc/$(MODULE)/$$(strip $2):
$$(call MakeTargetDir)
- $(PRINTF) "jdk=$(VERSION_NUMBER)\nfull=$(VERSION_STRING)\nrelease=$(VERSION_SHORT)\n" \
- > $$@
+ $(PRINTF) "jdk=%s\nfull=%s\nrelease=%s\n" \
+ $(VERSION_NUMBER) $(VERSION_STRING) $(VERSION_SHORT) > $$@
$$(strip $1) += $(SUPPORT_OUTPUTDIR)/gensrc/$(MODULE)/$$(strip $2)
endef
diff --git a/make/common/native/Paths.gmk b/make/common/native/Paths.gmk
index ee097b2e134ff..bdb8828eb3279 100644
--- a/make/common/native/Paths.gmk
+++ b/make/common/native/Paths.gmk
@@ -128,10 +128,9 @@ define SetupSourceFiles
# Extract the C/C++ files.
ifneq ($$($1_EXCLUDE_PATTERNS), )
# We must not match the exclude pattern against the src root(s).
- $1_SRCS_WITHOUT_ROOTS := $$($1_SRCS)
- $$(foreach i, $$($1_SRC), $$(eval $1_SRCS_WITHOUT_ROOTS := $$(patsubst \
- $$i/%,%, $$($1_SRCS_WITHOUT_ROOTS))))
- $1_ALL_EXCLUDE_FILES := $$(call containing, $$($1_EXCLUDE_PATTERNS), \
+ $1_SRCS_WITHOUT_ROOTS := $$(foreach i, $$($1_SRC), \
+ $$(patsubst $$i/%,%, $$(filter $$i/%, $$($1_SRCS))))
+ $1_ALL_EXCLUDE_FILES := $$(call containing, $$($1_EXCLUDE_PATTERNS), \
$$($1_SRCS_WITHOUT_ROOTS))
endif
ifneq ($$($1_EXCLUDE_FILES), )
diff --git a/make/common/support/unicode2x.sed b/make/common/support/unicode2x.sed
deleted file mode 100644
index 5188b97fe032a..0000000000000
--- a/make/common/support/unicode2x.sed
+++ /dev/null
@@ -1,100 +0,0 @@
-s/\\u0020/\x20/g
-s/\\u003A/\x3A/g
-s/\\u006B/\x6B/g
-s/\\u0075/\x75/g
-s/\\u00A0/\xA0/g
-s/\\u00A3/\xA3/g
-s/\\u00B0/\xB0/g
-s/\\u00B7/\xB7/g
-s/\\u00BA/\xBA/g
-s/\\u00BF/\xBF/g
-s/\\u00C0/\xC0/g
-s/\\u00C1/\xC1/g
-s/\\u00C2/\xC2/g
-s/\\u00C4/\xC4/g
-s/\\u00C5/\xC5/g
-s/\\u00C8/\xC8/g
-s/\\u00C9/\xC9/g
-s/\\u00CA/\xCA/g
-s/\\u00CD/\xCD/g
-s/\\u00CE/\xCE/g
-s/\\u00D3/\xD3/g
-s/\\u00D4/\xD4/g
-s/\\u00D6/\xD6/g
-s/\\u00DA/\xDA/g
-s/\\u00DC/\xDC/g
-s/\\u00DD/\xDD/g
-s/\\u00DF/\xDF/g
-s/\\u00E0/\xE0/g
-s/\\u00E1/\xE1/g
-s/\\u00E2/\xE2/g
-s/\\u00E3/\xE3/g
-s/\\u00E4/\xE4/g
-s/\\u00E5/\xE5/g
-s/\\u00E6/\xE6/g
-s/\\u00E7/\xE7/g
-s/\\u00E8/\xE8/g
-s/\\u00E9/\xE9/g
-s/\\u00EA/\xEA/g
-s/\\u00EB/\xEB/g
-s/\\u00EC/\xEC/g
-s/\\u00ED/\xED/g
-s/\\u00EE/\xEE/g
-s/\\u00EF/\xEF/g
-s/\\u00F1/\xF1/g
-s/\\u00F2/\xF2/g
-s/\\u00F3/\xF3/g
-s/\\u00F4/\xF4/g
-s/\\u00F5/\xF5/g
-s/\\u00F6/\xF6/g
-s/\\u00F9/\xF9/g
-s/\\u00FA/\xFA/g
-s/\\u00FC/\xFC/g
-s/\\u0020/\x20/g
-s/\\u003f/\x3f/g
-s/\\u006f/\x6f/g
-s/\\u0075/\x75/g
-s/\\u00a0/\xa0/g
-s/\\u00a3/\xa3/g
-s/\\u00b0/\xb0/g
-s/\\u00ba/\xba/g
-s/\\u00bf/\xbf/g
-s/\\u00c1/\xc1/g
-s/\\u00c4/\xc4/g
-s/\\u00c5/\xc5/g
-s/\\u00c8/\xc8/g
-s/\\u00c9/\xc9/g
-s/\\u00ca/\xca/g
-s/\\u00cd/\xcd/g
-s/\\u00d6/\xd6/g
-s/\\u00dc/\xdc/g
-s/\\u00dd/\xdd/g
-s/\\u00df/\xdf/g
-s/\\u00e0/\xe0/g
-s/\\u00e1/\xe1/g
-s/\\u00e2/\xe2/g
-s/\\u00e3/\xe3/g
-s/\\u00e4/\xe4/g
-s/\\u00e5/\xe5/g
-s/\\u00e7/\xe7/g
-s/\\u00e8/\xe8/g
-s/\\u00e9/\xe9/g
-s/\\u00ea/\xea/g
-s/\\u00eb/\xeb/g
-s/\\u00ec/\xec/g
-s/\\u00ed/\xed/g
-s/\\u00ee/\xee/g
-s/\\u00ef/\xef/g
-s/\\u00f0/\xf0/g
-s/\\u00f1/\xf1/g
-s/\\u00f2/\xf2/g
-s/\\u00f3/\xf3/g
-s/\\u00f4/\xf4/g
-s/\\u00f5/\xf5/g
-s/\\u00f6/\xf6/g
-s/\\u00f7/\xf7/g
-s/\\u00f8/\xf8/g
-s/\\u00f9/\xf9/g
-s/\\u00fa/\xfa/g
-s/\\u00fc/\xfc/g
-s/\\u00ff/\xff/g
diff --git a/make/conf/github-actions.conf b/make/conf/github-actions.conf
index c25e51a48e4c6..27845ffbd7aa0 100644
--- a/make/conf/github-actions.conf
+++ b/make/conf/github-actions.conf
@@ -33,8 +33,8 @@ LINUX_X64_BOOT_JDK_URL=https://download.java.net/java/GA/jdk24/1f9ff9062db4449d8
LINUX_X64_BOOT_JDK_SHA256=88b090fa80c6c1d084ec9a755233967458788e2c0777ae2e172230c5c692d7ef
ALPINE_LINUX_X64_BOOT_JDK_EXT=tar.gz
-ALPINE_LINUX_X64_BOOT_JDK_URL=https://github.com/adoptium/temurin24-binaries/releases/download/jdk-24%2B36/OpenJDK24U-jdk_aarch64_alpine-linux_hotspot_24_36.tar.gz
-ALPINE_LINUX_X64_BOOT_JDK_SHA256=4a673456aa6e726b86108a095a21868b7ebcdde050a92b3073d50105ff92f07f
+ALPINE_LINUX_X64_BOOT_JDK_URL=https://github.com/adoptium/temurin24-binaries/releases/download/jdk-24%2B36/OpenJDK24U-jdk_x64_alpine-linux_hotspot_24_36.tar.gz
+ALPINE_LINUX_X64_BOOT_JDK_SHA256=a642608f0da78344ee6812fb1490b8bc1d7ad5a18064c70994d6f330568c51cb
MACOS_AARCH64_BOOT_JDK_EXT=tar.gz
MACOS_AARCH64_BOOT_JDK_URL=https://download.java.net/java/GA/jdk24/1f9ff9062db4449d8ca828c504ffae90/36/GPL/openjdk-24_macos-aarch64_bin.tar.gz
diff --git a/make/conf/jib-profiles.js b/make/conf/jib-profiles.js
index 02474f3dccb73..172ed74f4cddf 100644
--- a/make/conf/jib-profiles.js
+++ b/make/conf/jib-profiles.js
@@ -241,10 +241,10 @@ var getJibProfilesCommon = function (input, data) {
// List of the main profile names used for iteration
common.main_profile_names = [
- "linux-x64", "linux-x86", "macosx-x64", "macosx-aarch64",
+ "macosx-x64", "macosx-aarch64",
"windows-x64", "windows-aarch64",
- "linux-aarch64", "linux-arm32", "linux-ppc64le", "linux-s390x",
- "linux-riscv64"
+ "linux-x64", "linux-aarch64",
+ "linux-arm32", "linux-ppc64le", "linux-s390x", "linux-riscv64"
];
// These are the base settings for all the main build profiles.
@@ -283,9 +283,6 @@ var getJibProfilesCommon = function (input, data) {
labels: "open"
};
- common.configure_args_64bit = ["--with-target-bits=64"];
- common.configure_args_32bit = ["--with-target-bits=32"];
-
/**
* Define common artifacts template for all main profiles
* @param o - Object containing data for artifacts
@@ -412,58 +409,34 @@ var getJibProfilesProfiles = function (input, common, data) {
// Main SE profiles
var profiles = {
-
- "linux-x64": {
- target_os: "linux",
- target_cpu: "x64",
- dependencies: ["devkit", "gtest", "build_devkit", "graphviz", "pandoc", "tidy"],
- configure_args: concat(
- (input.build_cpu == "x64" ? common.configure_args_64bit
- : "--openjdk-target=x86_64-linux-gnu"),
- "--with-zlib=system", "--disable-dtrace",
- (isWsl(input) ? [ "--host=x86_64-unknown-linux-gnu",
- "--build=x86_64-unknown-linux-gnu" ] : [])),
- },
-
- "linux-x86": {
- target_os: "linux",
- target_cpu: "x86",
- build_cpu: "x64",
- dependencies: ["devkit", "gtest", "libffi"],
- configure_args: concat(common.configure_args_32bit, [
- "--with-jvm-variants=minimal,server",
- "--with-zlib=system",
- "--with-libffi=" + input.get("libffi", "home_path"),
- "--enable-libffi-bundling",
- "--enable-fallback-linker"
- ])
- },
-
"macosx-x64": {
target_os: "macosx",
target_cpu: "x64",
dependencies: ["devkit", "gtest", "graphviz", "pandoc", "tidy"],
- configure_args: concat(common.configure_args_64bit, "--with-zlib=system",
+ configure_args: [
+ "--with-zlib=system",
"--with-macosx-version-max=11.00.00",
"--enable-compatible-cds-alignment",
// Use system SetFile instead of the one in the devkit as the
// devkit one may not work on Catalina.
- "SETFILE=/usr/bin/SetFile"),
+ "SETFILE=/usr/bin/SetFile"
+ ],
},
"macosx-aarch64": {
target_os: "macosx",
target_cpu: "aarch64",
dependencies: ["devkit", "gtest", "graphviz", "pandoc", "tidy"],
- configure_args: concat(common.configure_args_64bit,
- "--with-macosx-version-max=11.00.00"),
+ configure_args: [
+ "--with-macosx-version-max=11.00.00"
+ ],
},
"windows-x64": {
target_os: "windows",
target_cpu: "x64",
dependencies: ["devkit", "gtest", "pandoc"],
- configure_args: concat(common.configure_args_64bit),
+ configure_args: [],
},
"windows-aarch64": {
@@ -475,7 +448,19 @@ var getJibProfilesProfiles = function (input, common, data) {
],
},
- "linux-aarch64": {
+ "linux-x64": {
+ target_os: "linux",
+ target_cpu: "x64",
+ dependencies: ["devkit", "gtest", "build_devkit", "graphviz", "pandoc", "tidy"],
+ configure_args: concat(
+ "--with-zlib=system",
+ "--disable-dtrace",
+ (cross_compiling ? [ "--openjdk-target=x86_64-linux-gnu" ] : []),
+ (isWsl(input) ? [ "--host=x86_64-unknown-linux-gnu",
+ "--build=x86_64-unknown-linux-gnu" ] : [])),
+ },
+
+ "linux-aarch64": {
target_os: "linux",
target_cpu: "aarch64",
dependencies: ["devkit", "gtest", "build_devkit", "graphviz", "pandoc", "tidy"],
@@ -492,8 +477,10 @@ var getJibProfilesProfiles = function (input, common, data) {
build_cpu: "x64",
dependencies: ["devkit", "gtest", "build_devkit"],
configure_args: [
- "--openjdk-target=arm-linux-gnueabihf", "--with-freetype=bundled",
- "--with-abi-profile=arm-vfp-hflt", "--disable-warnings-as-errors"
+ "--openjdk-target=arm-linux-gnueabihf",
+ "--with-freetype=bundled",
+ "--with-abi-profile=arm-vfp-hflt",
+ "--disable-warnings-as-errors"
],
},
@@ -503,7 +490,8 @@ var getJibProfilesProfiles = function (input, common, data) {
build_cpu: "x64",
dependencies: ["devkit", "gtest", "build_devkit"],
configure_args: [
- "--openjdk-target=ppc64le-linux-gnu", "--with-freetype=bundled",
+ "--openjdk-target=ppc64le-linux-gnu",
+ "--with-freetype=bundled",
"--disable-warnings-as-errors"
],
},
@@ -514,7 +502,8 @@ var getJibProfilesProfiles = function (input, common, data) {
build_cpu: "x64",
dependencies: ["devkit", "gtest", "build_devkit"],
configure_args: [
- "--openjdk-target=s390x-linux-gnu", "--with-freetype=bundled",
+ "--openjdk-target=s390x-linux-gnu",
+ "--with-freetype=bundled",
"--disable-warnings-as-errors"
],
},
@@ -525,7 +514,8 @@ var getJibProfilesProfiles = function (input, common, data) {
build_cpu: "x64",
dependencies: ["devkit", "gtest", "build_devkit"],
configure_args: [
- "--openjdk-target=riscv64-linux-gnu", "--with-freetype=bundled",
+ "--openjdk-target=riscv64-linux-gnu",
+ "--with-freetype=bundled",
"--disable-warnings-as-errors"
],
},
@@ -586,24 +576,24 @@ var getJibProfilesProfiles = function (input, common, data) {
target_os: "linux",
target_cpu: "x64",
dependencies: ["devkit", "gtest", "libffi"],
- configure_args: concat(common.configure_args_64bit, [
+ configure_args: [
"--with-zlib=system",
"--with-jvm-variants=zero",
"--with-libffi=" + input.get("libffi", "home_path"),
"--enable-libffi-bundling",
- ])
+ ]
},
"linux-aarch64-zero": {
target_os: "linux",
target_cpu: "aarch64",
dependencies: ["devkit", "gtest", "libffi"],
- configure_args: concat(common.configure_args_64bit, [
+ configure_args: [
"--with-zlib=system",
"--with-jvm-variants=zero",
"--with-libffi=" + input.get("libffi", "home_path"),
"--enable-libffi-bundling"
- ])
+ ]
},
"linux-x86-zero": {
@@ -611,12 +601,13 @@ var getJibProfilesProfiles = function (input, common, data) {
target_cpu: "x86",
build_cpu: "x64",
dependencies: ["devkit", "gtest", "libffi"],
- configure_args: concat(common.configure_args_32bit, [
+ configure_args: [
+ "--with-target-bits=32",
"--with-zlib=system",
"--with-jvm-variants=zero",
"--with-libffi=" + input.get("libffi", "home_path"),
"--enable-libffi-bundling"
- ])
+ ]
}
}
profiles = concatObjects(profiles, zeroProfiles);
@@ -635,8 +626,10 @@ var getJibProfilesProfiles = function (input, common, data) {
target_os: "linux",
target_cpu: "x64",
dependencies: ["devkit", "gtest"],
- configure_args: concat(common.configure_args_64bit,
- "--with-zlib=system", "--disable-precompiled-headers"),
+ configure_args: [
+ "--with-zlib=system",
+ "--disable-precompiled-headers"
+ ],
},
};
profiles = concatObjects(profiles, noPchProfiles);
@@ -693,9 +686,6 @@ var getJibProfilesProfiles = function (input, common, data) {
"linux-x64": {
platform: "linux-x64",
},
- "linux-x86": {
- platform: "linux-x86",
- },
"macosx-x64": {
platform: "macos-x64",
jdk_subdir: "jdk-" + data.version + ".jdk/Contents/Home",
@@ -1088,14 +1078,14 @@ var getJibProfilesProfiles = function (input, common, data) {
var getJibProfilesDependencies = function (input, common) {
var devkit_platform_revisions = {
- linux_x64: "gcc13.2.0-OL6.4+1.0",
- macosx: "Xcode14.3.1+1.0",
- windows_x64: "VS2022-17.6.5+1.0",
- linux_aarch64: "gcc13.2.0-OL7.6+1.0",
+ linux_x64: "gcc14.2.0-OL6.4+1.0",
+ macosx: "Xcode15.4+1.0",
+ windows_x64: "VS2022-17.13.2+1.0",
+ linux_aarch64: "gcc14.2.0-OL7.6+1.0",
linux_arm: "gcc8.2.0-Fedora27+1.0",
- linux_ppc64le: "gcc13.2.0-Fedora_41+1.0",
- linux_s390x: "gcc13.2.0-Fedora_41+1.0",
- linux_riscv64: "gcc13.2.0-Fedora_41+1.0"
+ linux_ppc64le: "gcc14.2.0-Fedora_41+1.0",
+ linux_s390x: "gcc14.2.0-Fedora_41+1.0",
+ linux_riscv64: "gcc14.2.0-Fedora_41+1.0"
};
var devkit_platform = (input.target_cpu == "x86"
diff --git a/make/data/cldr/LICENSE b/make/data/cldr/LICENSE
index 9065fe54d8b9e..ca907d75617c8 100644
--- a/make/data/cldr/LICENSE
+++ b/make/data/cldr/LICENSE
@@ -1,4 +1,4 @@
-UNICODE LICENSE V3
+UNICODE LICENSE V3
COPYRIGHT AND PERMISSION NOTICE
diff --git a/make/data/ubsan/ubsan_default_options.c b/make/data/ubsan/ubsan_default_options.c
index 011d1a675a90f..05e4722e45a27 100644
--- a/make/data/ubsan/ubsan_default_options.c
+++ b/make/data/ubsan/ubsan_default_options.c
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2022, 2023, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2022, 2025, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
@@ -43,6 +43,18 @@
#define ATTRIBUTE_USED
#endif
+// On AIX, the llvm_symbolizer is not found out of the box, so we have to provide the
+// fully qualified llvm_symbolizer path in the __ubsan_default_options() function.
+// To get it here we compile our sources with an additional define LLVM_SYMBOLIZER
+// containing the path, which we set in make/autoconf/jdk-options.m4.
+#ifdef LLVM_SYMBOLIZER
+#define _LLVM_SYMBOLIZER(X) ",external_symbolizer_path=" X_LLVM_SYMBOLIZER(X)
+#define X_LLVM_SYMBOLIZER(X) #X
+#else
+#define LLVM_SYMBOLIZER
+#define _LLVM_SYMBOLIZER(X)
+#endif
+
// Override weak symbol exposed by UBSan to override default options. This is called by UBSan
// extremely early during library loading, before main is called. We need to override the default
// options because by default UBSan only prints a warning for each occurrence. We want jtreg tests
@@ -50,5 +62,5 @@
// thread so it is easier to track down. You can override these options by setting the environment
// variable UBSAN_OPTIONS.
ATTRIBUTE_DEFAULT_VISIBILITY ATTRIBUTE_USED const char* __ubsan_default_options() {
- return "halt_on_error=1,print_stacktrace=1";
+ return "halt_on_error=1,print_stacktrace=1" _LLVM_SYMBOLIZER(LLVM_SYMBOLIZER);
}
diff --git a/make/devkit/Tools.gmk b/make/devkit/Tools.gmk
index 249eaa6624715..1b9240df49c8c 100644
--- a/make/devkit/Tools.gmk
+++ b/make/devkit/Tools.gmk
@@ -39,6 +39,8 @@
# Fix this...
#
+uppercase = $(shell echo $1 | tr a-z A-Z)
+
$(info TARGET=$(TARGET))
$(info HOST=$(HOST))
$(info BUILD=$(BUILD))
@@ -91,89 +93,28 @@ endif
################################################################################
# Define external dependencies
-# Latest that could be made to work.
-GCC_VER := 13.2.0
-ifeq ($(GCC_VER), 13.2.0)
- gcc_ver := gcc-13.2.0
- binutils_ver := binutils-2.41
- ccache_ver := ccache-3.7.12
- mpfr_ver := mpfr-4.2.0
- gmp_ver := gmp-6.3.0
- mpc_ver := mpc-1.3.1
- gdb_ver := gdb-13.2
- REQUIRED_MIN_MAKE_MAJOR_VERSION := 4
-else ifeq ($(GCC_VER), 11.3.0)
- gcc_ver := gcc-11.3.0
- binutils_ver := binutils-2.39
- ccache_ver := ccache-3.7.12
- mpfr_ver := mpfr-4.1.1
- gmp_ver := gmp-6.2.1
- mpc_ver := mpc-1.2.1
- gdb_ver := gdb-11.2
- REQUIRED_MIN_MAKE_MAJOR_VERSION := 4
-else ifeq ($(GCC_VER), 11.2.0)
- gcc_ver := gcc-11.2.0
- binutils_ver := binutils-2.37
- ccache_ver := ccache-3.7.12
- mpfr_ver := mpfr-4.1.0
- gmp_ver := gmp-6.2.1
- mpc_ver := mpc-1.2.1
- gdb_ver := gdb-11.1
- REQUIRED_MIN_MAKE_MAJOR_VERSION := 4
-else ifeq ($(GCC_VER), 10.3.0)
- gcc_ver := gcc-10.3.0
- binutils_ver := binutils-2.36.1
- ccache_ver := ccache-3.7.11
- mpfr_ver := mpfr-4.1.0
- gmp_ver := gmp-6.2.0
- mpc_ver := mpc-1.1.0
- gdb_ver := gdb-10.1
- REQUIRED_MIN_MAKE_MAJOR_VERSION := 4
-else ifeq ($(GCC_VER), 10.2.0)
- gcc_ver := gcc-10.2.0
- binutils_ver := binutils-2.35
- ccache_ver := ccache-3.7.11
- mpfr_ver := mpfr-4.1.0
- gmp_ver := gmp-6.2.0
- mpc_ver := mpc-1.1.0
- gdb_ver := gdb-9.2
- REQUIRED_MIN_MAKE_MAJOR_VERSION := 4
-else ifeq ($(GCC_VER), 9.2.0)
- gcc_ver := gcc-9.2.0
- binutils_ver := binutils-2.34
- ccache_ver := ccache-3.7.3
- mpfr_ver := mpfr-3.1.5
- gmp_ver := gmp-6.1.2
- mpc_ver := mpc-1.0.3
- gdb_ver := gdb-8.3
-else ifeq ($(GCC_VER), 8.3.0)
- gcc_ver := gcc-8.3.0
- binutils_ver := binutils-2.32
- ccache_ver := ccache-3.7.3
- mpfr_ver := mpfr-3.1.5
- gmp_ver := gmp-6.1.2
- mpc_ver := mpc-1.0.3
- gdb_ver := gdb-8.3
-else ifeq ($(GCC_VER), 7.3.0)
- gcc_ver := gcc-7.3.0
- binutils_ver := binutils-2.30
- ccache_ver := ccache-3.3.6
- mpfr_ver := mpfr-3.1.5
- gmp_ver := gmp-6.1.2
- mpc_ver := mpc-1.0.3
- gdb_ver := gdb-8.1
-else ifeq ($(GCC_VER), 4.9.2)
- gcc_ver := gcc-4.9.2
- binutils_ver := binutils-2.25
- ccache_ver := ccache-3.2.1
- mpfr_ver := mpfr-3.0.1
- gmp_ver := gmp-4.3.2
- mpc_ver := mpc-1.0.1
- gdb_ver := gdb-7.12.1
-else
- $(error Unsupported GCC version)
-endif
+gcc_ver_only := 14.2.0
+binutils_ver_only := 2.43
+ccache_ver_only := 4.10.2
+CCACHE_CMAKE_BASED := 1
+mpfr_ver_only := 4.2.1
+gmp_ver_only := 6.3.0
+mpc_ver_only := 1.3.1
+gdb_ver_only := 15.2
+
+dependencies := gcc binutils ccache mpfr gmp mpc gdb
+
+$(foreach dep,$(dependencies),$(eval $(dep)_ver := $(dep)-$($(dep)_ver_only)))
+
+GCC := http://ftp.gnu.org/pub/gnu/gcc/$(gcc_ver)/$(gcc_ver).tar.xz
+BINUTILS := http://ftp.gnu.org/pub/gnu/binutils/$(binutils_ver).tar.gz
+CCACHE := https://github.com/ccache/ccache/releases/download/v$(ccache_ver_only)/$(ccache_ver).tar.xz
+MPFR := https://www.mpfr.org/$(mpfr_ver)/$(mpfr_ver).tar.bz2
+GMP := http://ftp.gnu.org/pub/gnu/gmp/$(gmp_ver).tar.bz2
+MPC := http://ftp.gnu.org/pub/gnu/mpc/$(mpc_ver).tar.gz
+GDB := http://ftp.gnu.org/gnu/gdb/$(gdb_ver).tar.xz
+REQUIRED_MIN_MAKE_MAJOR_VERSION := 4
ifneq ($(REQUIRED_MIN_MAKE_MAJOR_VERSION),)
MAKE_MAJOR_VERSION := $(word 1,$(subst ., ,$(MAKE_VERSION)))
SUPPORTED_MAKE_VERSION := $(shell [ $(MAKE_MAJOR_VERSION) -ge $(REQUIRED_MIN_MAKE_MAJOR_VERSION) ] && echo true)
@@ -182,17 +123,6 @@ ifneq ($(REQUIRED_MIN_MAKE_MAJOR_VERSION),)
endif
endif
-ccache_ver_only := $(patsubst ccache-%,%,$(ccache_ver))
-
-
-GCC := http://ftp.gnu.org/pub/gnu/gcc/$(gcc_ver)/$(gcc_ver).tar.xz
-BINUTILS := http://ftp.gnu.org/pub/gnu/binutils/$(binutils_ver).tar.gz
-CCACHE := https://github.com/ccache/ccache/releases/download/v$(ccache_ver_only)/$(ccache_ver).tar.xz
-MPFR := https://www.mpfr.org/${mpfr_ver}/${mpfr_ver}.tar.bz2
-GMP := http://ftp.gnu.org/pub/gnu/gmp/${gmp_ver}.tar.bz2
-MPC := http://ftp.gnu.org/pub/gnu/mpc/${mpc_ver}.tar.gz
-GDB := http://ftp.gnu.org/gnu/gdb/${gdb_ver}.tar.xz
-
# RPMs used by all BASE_OS
RPM_LIST := \
$(KERNEL_HEADERS_RPM) \
@@ -262,10 +192,18 @@ define Download
# Allow override
$(1)_DIRNAME ?= $(basename $(basename $(notdir $($(1)))))
$(1)_DIR = $(abspath $(SRCDIR)/$$($(1)_DIRNAME))
- $(1)_CFG = $$($(1)_DIR)/configure
+ ifeq ($$($(1)_CMAKE_BASED),)
+ $(1)_CFG = $$($(1)_DIR)/configure
+ $(1)_SRC_MARKER = $$($(1)_DIR)/configure
+ $(1)_CONFIG = $(CONFIG)
+ else
+ $(1)_CFG = cmake
+ $(1)_SRC_MARKER = $$($(1)_DIR)/CMakeLists.txt
+ $(1)_CONFIG = $$(CMAKE_CONFIG) $$($(1)_DIR)
+ endif
$(1)_FILE = $(DOWNLOAD)/$(notdir $($(1)))
- $$($(1)_CFG) : $$($(1)_FILE)
+ $$($(1)_SRC_MARKER) : $$($(1)_FILE)
mkdir -p $$(SRCDIR)
tar -C $$(SRCDIR) -xf $$<
$$(foreach p,$$(abspath $$(wildcard patches/$$(ARCH)-$$(notdir $$($(1)_DIR)).patch)), \
@@ -279,7 +217,7 @@ define Download
endef
# Download and unpack all source packages
-$(foreach p,GCC BINUTILS CCACHE MPFR GMP MPC GDB,$(eval $(call Download,$(p))))
+$(foreach dep,$(dependencies),$(eval $(call Download,$(call uppercase,$(dep)))))
################################################################################
# Unpack RPMS
@@ -356,7 +294,7 @@ endif
################################################################################
# Define marker files for each source package to be compiled
-$(foreach t,binutils mpfr gmp mpc gcc ccache gdb,$(eval $(t) = $(TARGETDIR)/$($(t)_ver).done))
+$(foreach dep,$(dependencies),$(eval $(dep) = $(TARGETDIR)/$($(dep)_ver).done))
################################################################################
@@ -365,6 +303,8 @@ CONFIG = --target=$(TARGET) \
--host=$(HOST) --build=$(BUILD) \
--prefix=$(PREFIX)
+CMAKE_CONFIG = -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=$(PREFIX)
+
PATHEXT = $(PREFIX)/bin:
PATHPRE = PATH=$(PATHEXT)$(PATH)
@@ -576,6 +516,8 @@ ifeq ($(HOST), $(TARGET))
$(PATHPRE) $(ENVS) CFLAGS="$(CFLAGS)" $(GDB_CFG) \
$(CONFIG) \
--with-sysroot=$(SYSROOT) \
+ --with-mpfr=$(PREFIX) \
+ --with-gmp=$(PREFIX) \
) > $(@D)/log.config 2>&1
@echo 'done'
@@ -591,13 +533,13 @@ endif
################################################################################
# very straightforward. just build a ccache. it is only for host.
$(BUILDDIR)/$(ccache_ver)/Makefile \
- : $(CCACHE_CFG)
+ : $(CCACHE_SRC_MARKER)
$(info Configuring $@. Log in $(@D)/log.config)
@mkdir -p $(@D)
@( \
cd $(@D) ; \
$(PATHPRE) $(ENVS) $(CCACHE_CFG) \
- $(CONFIG) \
+ $(CCACHE_CONFIG) \
) > $(@D)/log.config 2>&1
@echo 'done'
@@ -699,10 +641,18 @@ ifeq ($(TARGET), $(HOST))
ln -s $(TARGET)-$* $@
missing-links := $(addprefix $(PREFIX)/bin/, \
- addr2line ar as c++ c++filt dwp elfedit g++ gcc gcc-$(GCC_VER) gprof ld ld.bfd \
+ addr2line ar as c++ c++filt dwp elfedit g++ gcc gcc-$(gcc_ver_only) gprof ld ld.bfd \
ld.gold nm objcopy objdump ranlib readelf size strings strip)
endif
+# Add link to work around "plugin needed to handle lto object" (JDK-8344272)
+$(PREFIX)/lib/bfd-plugins/liblto_plugin.so: $(PREFIX)/libexec/gcc/$(TARGET)/$(gcc_ver_only)/liblto_plugin.so
+ @echo 'Creating missing $(@F) soft link'
+ @mkdir -p $(@D)
+ ln -s $$(realpath -s --relative-to=$(@D) $<) $@
+
+missing-links += $(PREFIX)/lib/bfd-plugins/liblto_plugin.so
+
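The realpath -s --relative-to idiom makes the link target relative to the link's own directory, so the devkit stays relocatable. A sketch with placeholder paths:

    PREFIX=/opt/devkit; TARGET=aarch64-linux-gnu; gcc_ver_only=14.2.0   # all made up
    src=$PREFIX/libexec/gcc/$TARGET/$gcc_ver_only/liblto_plugin.so
    dst=$PREFIX/lib/bfd-plugins/liblto_plugin.so
    mkdir -p $(dirname $dst)
    ln -s $(realpath -s --relative-to=$(dirname $dst) $src) $dst
    # dst -> ../../libexec/gcc/aarch64-linux-gnu/14.2.0/liblto_plugin.so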
################################################################################
bfdlib : $(bfdlib)
diff --git a/make/devkit/createAutoconfBundle.sh b/make/devkit/createAutoconfBundle.sh
index 7363b9cd8a71a..ebe9c427f76ea 100644
--- a/make/devkit/createAutoconfBundle.sh
+++ b/make/devkit/createAutoconfBundle.sh
@@ -1,6 +1,6 @@
#!/bin/bash -e
#
-# Copyright (c) 2018, 2024, Oracle and/or its affiliates. All rights reserved.
+# Copyright (c) 2018, 2025, Oracle and/or its affiliates. All rights reserved.
# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
#
# This code is free software; you can redistribute it and/or modify it
@@ -24,10 +24,21 @@
# questions.
#
-# Create a bundle in the current directory, containing what's needed to run
+# Create a bundle in the OpenJDK build folder, containing what's needed to run
# the 'autoconf' program by the OpenJDK build. To override TARGET_PLATFORM
# just set the variable before running this script.
+# This script fetches sources from the network, so make sure your proxy is set up appropriately.
+
+# colored print to highlight some of the logs
+function print_log()
+{
+ Color_Cyan='\033[1;36m' # Cyan
+ Color_Off='\033[0m' # Reset color
+ printf "${Color_Cyan}> $1${Color_Off}\n"
+}
+
+
# Autoconf depends on m4, so download and build that first.
AUTOCONF_VERSION=2.69
M4_VERSION=1.4.18
@@ -58,11 +69,12 @@ MODULE_NAME=autoconf-$TARGET_PLATFORM-$AUTOCONF_VERSION+$PACKAGE_VERSION
BUNDLE_NAME=$MODULE_NAME.tar.gz
SCRIPT_DIR="$(cd "$(dirname $0)" > /dev/null && pwd)"
-OUTPUT_ROOT="${SCRIPT_DIR}/../../build/autoconf"
+BASEDIR="$(cd "$SCRIPT_DIR/../.." > /dev/null && pwd)"
+OUTPUT_ROOT="$BASEDIR/build/autoconf"
-cd $OUTPUT_ROOT
IMAGE_DIR=$OUTPUT_ROOT/$MODULE_NAME
mkdir -p $IMAGE_DIR/usr
+cd $OUTPUT_ROOT
# Download and build m4
@@ -76,7 +88,7 @@ elif test "x$TARGET_PLATFORM" = xcygwin_x86; then
cp /usr/bin/m4 $IMAGE_DIR/usr/bin
elif test "x$TARGET_PLATFORM" = xlinux_x64; then
M4_VERSION=1.4.13-5
- wget http://yum.oracle.com/repo/OracleLinux/OL6/latest/x86_64/getPackage/m4-$M4_VERSION.el6.x86_64.rpm
+ wget https://yum.oracle.com/repo/OracleLinux/OL6/latest/x86_64/getPackage/m4-$M4_VERSION.el6.x86_64.rpm
cd $IMAGE_DIR
rpm2cpio $OUTPUT_ROOT/m4-$M4_VERSION.el6.x86_64.rpm | cpio -d -i
elif test "x$TARGET_PLATFORM" = xlinux_x86; then
@@ -85,27 +97,38 @@ elif test "x$TARGET_PLATFORM" = xlinux_x86; then
cd $IMAGE_DIR
rpm2cpio $OUTPUT_ROOT/m4-$M4_VERSION.el6.i686.rpm | cpio -d -i
else
+ print_log "m4: download"
wget https://ftp.gnu.org/gnu/m4/m4-$M4_VERSION.tar.gz
- tar xzf m4-$M4_VERSION.tar.gz
+ tar -xzf m4-$M4_VERSION.tar.gz
cd m4-$M4_VERSION
+ print_log "m4: configure"
./configure --prefix=$IMAGE_DIR/usr CFLAGS="-w -Wno-everything"
+ print_log "m4: make"
make
+ print_log "m4: make install"
make install
cd ..
fi
# Download and build autoconf
+print_log "autoconf: download"
wget https://ftp.gnu.org/gnu/autoconf/autoconf-$AUTOCONF_VERSION.tar.gz
-tar xzf autoconf-$AUTOCONF_VERSION.tar.gz
+tar -xzf autoconf-$AUTOCONF_VERSION.tar.gz
cd autoconf-$AUTOCONF_VERSION
+print_log "autoconf: configure"
./configure --prefix=$IMAGE_DIR/usr M4=$IMAGE_DIR/usr/bin/m4
+print_log "autoconf: make"
make
+print_log "autoconf: make install"
make install
cd ..
+# The resulting scripts in the installation folder use absolute paths to reference other files within that folder
+print_log "replace absolute paths in the installation files with a relative ./"
perl -pi -e "s!$IMAGE_DIR/!./!" $IMAGE_DIR/usr/bin/auto* $IMAGE_DIR/usr/share/autoconf/autom4te.cfg
+print_log "creating $IMAGE_DIR/autoconf wrapper script"
cat > $IMAGE_DIR/autoconf << EOF
#!/bin/bash
# Get an absolute path to this script
@@ -123,6 +146,9 @@ PREPEND_INCLUDE="--prepend-include \$this_script_dir/usr/share/autoconf"
exec \$this_script_dir/usr/bin/autoconf \$PREPEND_INCLUDE "\$@"
EOF
+
chmod +x $IMAGE_DIR/autoconf
+
+print_log "archiving $IMAGE_DIR directory as $OUTPUT_ROOT/$BUNDLE_NAME"
cd $IMAGE_DIR
tar -cvzf $OUTPUT_ROOT/$BUNDLE_NAME *
diff --git a/make/devkit/createWindowsDevkit.sh b/make/devkit/createWindowsDevkit.sh
index 0646cb68ef44d..757fb157ad443 100644
--- a/make/devkit/createWindowsDevkit.sh
+++ b/make/devkit/createWindowsDevkit.sh
@@ -56,16 +56,22 @@ BUILD_DIR="${SCRIPT_DIR}/../../build/devkit"
UNAME_SYSTEM=`uname -s`
UNAME_RELEASE=`uname -r`
+UNAME_OS=`uname -o`
# Detect cygwin or WSL
IS_CYGWIN=`echo $UNAME_SYSTEM | grep -i CYGWIN`
IS_WSL=`echo $UNAME_RELEASE | grep Microsoft`
+IS_MSYS=`echo $UNAME_OS | grep -i Msys`
+MSYS2_ARG_CONV_EXCL="*" # make "cmd.exe /c" work for msys2
+CMD_EXE="cmd.exe /c"
if test "x$IS_CYGWIN" != "x"; then
BUILD_ENV="cygwin"
+elif test "x$IS_MSYS" != "x"; then
+ BUILD_ENV="cygwin"
elif test "x$IS_WSL" != "x"; then
BUILD_ENV="wsl"
else
- echo "Unknown environment; only Cygwin and WSL are supported."
+ echo "Unknown environment; only Cygwin/MSYS2/WSL are supported."
exit 1
fi
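MSYS2_ARG_CONV_EXCL="*" turns off MSYS2's automatic POSIX-to-Windows argument conversion, which would otherwise rewrite arguments such as /c before they reach cmd.exe; a minimal sketch (assumes an MSYS2 shell):

    # With the exclusion in the environment, cmd.exe receives '/c' verbatim.
    MSYS2_ARG_CONV_EXCL="*" cmd.exe /c "echo %ProgramFiles%"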
@@ -76,7 +82,7 @@ elif test "x$BUILD_ENV" = "xwsl"; then
fi
# Work around the insanely named ProgramFiles(x86) env variable
-PROGRAMFILES_X86="$($WINDOWS_PATH_TO_UNIX_PATH "$(cmd.exe /c set | sed -n 's/^ProgramFiles(x86)=//p' | tr -d '\r')")"
+PROGRAMFILES_X86="$($WINDOWS_PATH_TO_UNIX_PATH "$(${CMD_EXE} set | sed -n 's/^ProgramFiles(x86)=//p' | tr -d '\r')")"
PROGRAMFILES="$($WINDOWS_PATH_TO_UNIX_PATH "$PROGRAMFILES")"
case $VS_VERSION in
@@ -99,13 +105,15 @@ esac
# Find Visual Studio installation dir
-VSNNNCOMNTOOLS=`cmd.exe /c echo %VS${VS_VERSION_NUM_NODOT}COMNTOOLS% | tr -d '\r'`
+VSNNNCOMNTOOLS=`${CMD_EXE} echo %VS${VS_VERSION_NUM_NODOT}COMNTOOLS% | tr -d '\r'`
+VSNNNCOMNTOOLS="$($WINDOWS_PATH_TO_UNIX_PATH "$VSNNNCOMNTOOLS")"
if [ -d "$VSNNNCOMNTOOLS" ]; then
- VS_INSTALL_DIR="$($WINDOWS_PATH_TO_UNIX_PATH "$VSNNNCOMNTOOLS/../..")"
+ VS_INSTALL_DIR="$VSNNNCOMNTOOLS/../.."
else
VS_INSTALL_DIR="${MSVC_PROGRAMFILES_DIR}/Microsoft Visual Studio/$VS_VERSION"
VS_INSTALL_DIR="$(ls -d "${VS_INSTALL_DIR}/"{Community,Professional,Enterprise} 2>/dev/null | head -n1)"
fi
+echo "VSNNNCOMNTOOLS: $VSNNNCOMNTOOLS"
echo "VS_INSTALL_DIR: $VS_INSTALL_DIR"
# Extract semantic version
@@ -180,7 +188,11 @@ cp $DEVKIT_ROOT/VC/redist/arm64/$MSVCP_DLL $DEVKIT_ROOT/VC/bin/arm64
################################################################################
# Copy SDK files
-SDK_INSTALL_DIR="$PROGRAMFILES_X86/Windows Kits/$SDK_VERSION"
+SDK_INSTALL_DIR=`${CMD_EXE} echo %WindowsSdkDir% | tr -d '\r'`
+SDK_INSTALL_DIR="$($WINDOWS_PATH_TO_UNIX_PATH "$SDK_INSTALL_DIR")"
+if [ ! -d "$SDK_INSTALL_DIR" ]; then
+ SDK_INSTALL_DIR="$PROGRAMFILES_X86/Windows Kits/$SDK_VERSION"
+fi
echo "SDK_INSTALL_DIR: $SDK_INSTALL_DIR"
SDK_FULL_VERSION="$(ls "$SDK_INSTALL_DIR/bin" | sort -r -n | head -n1)"
diff --git a/make/hotspot/lib/JvmFeatures.gmk b/make/hotspot/lib/JvmFeatures.gmk
index 0a897230f835b..ffea9aa3926b9 100644
--- a/make/hotspot/lib/JvmFeatures.gmk
+++ b/make/hotspot/lib/JvmFeatures.gmk
@@ -125,6 +125,7 @@ endif
ifneq ($(call check-jvm-feature, cds), true)
JVM_CFLAGS_FEATURES += -DINCLUDE_CDS=0
JVM_EXCLUDE_FILES += \
+ aotCodeCache.cpp \
classLoaderDataShared.cpp \
classLoaderExt.cpp \
systemDictionaryShared.cpp
diff --git a/make/jdk/src/classes/build/tools/classlist/HelloClasslist.java b/make/jdk/src/classes/build/tools/classlist/HelloClasslist.java
index 1b930ca752777..fa1b33bb03e34 100644
--- a/make/jdk/src/classes/build/tools/classlist/HelloClasslist.java
+++ b/make/jdk/src/classes/build/tools/classlist/HelloClasslist.java
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2016, 2024, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2016, 2025, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
@@ -59,6 +59,7 @@ public class HelloClasslist {
private static final Logger LOGGER = Logger.getLogger("Hello");
+ @SuppressWarnings("restricted")
public static void main(String ... args) throws Throwable {
FileSystems.getDefault();
@@ -141,6 +142,7 @@ public static void main(String ... args) throws Throwable {
HelloClasslist.class.getMethod("staticMethod_V").invoke(null);
var obj = HelloClasslist.class.getMethod("staticMethod_L_L", Object.class).invoke(null, instance);
HelloClasslist.class.getField("field").get(instance);
+ MethodHandles.Lookup.ClassOption.class.getEnumConstants();
// A selection of trivial and relatively common MH operations
invoke(MethodHandles.identity(double.class), 1.0);
@@ -160,6 +162,9 @@ record B(int b) { }
case B b -> b.b;
default -> 17;
};
+ // record run-time methods
+ o.equals(new B(5));
+ o.hashCode();
LOGGER.log(Level.FINE, "Value: " + value);
// The Striped64$Cell is loaded rarely only when there's a contention among
diff --git a/make/jdk/src/classes/build/tools/taglet/PreviewNote.java b/make/jdk/src/classes/build/tools/taglet/PreviewNote.java
new file mode 100644
index 0000000000000..ee3f9bea52717
--- /dev/null
+++ b/make/jdk/src/classes/build/tools/taglet/PreviewNote.java
@@ -0,0 +1,127 @@
+/*
+ * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This code is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 only, as
+ * published by the Free Software Foundation. Oracle designates this
+ * particular file as subject to the "Classpath" exception as provided
+ * by Oracle in the LICENSE file that accompanied this code.
+ *
+ * This code is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ * version 2 for more details (a copy is included in the LICENSE file that
+ * accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License version
+ * 2 along with this work; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
+ * or visit www.oracle.com if you need additional information or have any
+ * questions.
+ */
+
+package build.tools.taglet;
+
+import java.util.EnumSet;
+import java.util.List;
+import java.util.Set;
+
+
+import javax.lang.model.element.Element;
+import javax.tools.Diagnostic;
+
+
+import com.sun.source.doctree.DocTree;
+import com.sun.source.doctree.UnknownInlineTagTree;
+import jdk.javadoc.doclet.Doclet;
+import jdk.javadoc.doclet.DocletEnvironment;
+import jdk.javadoc.doclet.Reporter;
+import jdk.javadoc.doclet.StandardDoclet;
+import jdk.javadoc.doclet.Taglet;
+
+import static com.sun.source.doctree.DocTree.Kind.UNKNOWN_INLINE_TAG;
+
+/**
+ * An inline tag to insert a note formatted as a preview note.
+ * The tag can be used as follows:
+ *
+ * <pre>
+ * {@previewNote jep-number [Preview note heading]}
+ * Preview note content
+ * {@previewNote}
+ * </pre>
+ *
+ */
+public class PreviewNote implements Taglet {
+
+ static final String TAG_NAME = "previewNote";
+ Reporter reporter = null;
+
+ @Override
+ public void init(DocletEnvironment env, Doclet doclet) {
+ if (doclet instanceof StandardDoclet stdoclet) {
+ reporter = stdoclet.getReporter();
+ }
+ }
+
+ /**
+ * Returns the set of locations in which the tag may be used.
+ */
+ @Override
+ public Set<Taglet.Location> getAllowedLocations() {
+ return EnumSet.allOf(Taglet.Location.class);
+ }
+
+ @Override
+ public boolean isInlineTag() {
+ return true;
+ }
+
+ @Override
+ public String getName() {
+ return TAG_NAME;
+ }
+
+ @Override
+ public String toString(List<? extends DocTree> tags, Element elem) {
+
+ for (DocTree tag : tags) {
+ if (tag.getKind() == UNKNOWN_INLINE_TAG) {
+ UnknownInlineTagTree inlineTag = (UnknownInlineTagTree) tag;
+ String[] content = inlineTag.getContent().toString().trim().split("\\s+", 2);
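+ // A non-blank first word is the opening form ({@previewNote jep-number [heading]}); a bare {@previewNote} closes the note.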
+ if (!content[0].isBlank()) {
+ StringBuilder sb = new StringBuilder("""
+
+ """);
+ if (content.length == 2) {
+ sb.append("""
+
+ """)
+ .append(content[1])
+ .append("""
+
+ """);
+ }
+ sb.append("""
+
+
+ """;
+ }
+ }
+ }
+
+ if (reporter == null) {
+ throw new IllegalArgumentException("@" + TAG_NAME + " taglet content must be begin or end");
+ }
+ reporter.print(Diagnostic.Kind.ERROR, "@" + TAG_NAME + " taglet content must be begin or end");
+ return "";
+ }
+}
diff --git a/make/langtools/src/classes/build/tools/symbolgenerator/CreateSymbols.java b/make/langtools/src/classes/build/tools/symbolgenerator/CreateSymbols.java
index 6faefecd4247a..db8924c79fbf4 100644
--- a/make/langtools/src/classes/build/tools/symbolgenerator/CreateSymbols.java
+++ b/make/langtools/src/classes/build/tools/symbolgenerator/CreateSymbols.java
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2006, 2024, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2006, 2025, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
@@ -45,6 +45,18 @@
import java.io.OutputStream;
import java.io.StringWriter;
import java.io.Writer;
+import java.lang.classfile.*;
+import java.lang.classfile.attribute.*;
+import java.lang.classfile.constantpool.ClassEntry;
+import java.lang.classfile.constantpool.ConstantPoolBuilder;
+import java.lang.classfile.constantpool.ConstantValueEntry;
+import java.lang.classfile.constantpool.IntegerEntry;
+import java.lang.classfile.constantpool.Utf8Entry;
+import java.lang.constant.ClassDesc;
+import java.lang.constant.MethodTypeDesc;
+import java.lang.constant.ModuleDesc;
+import java.lang.constant.PackageDesc;
+import java.lang.reflect.AccessFlag;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.FileVisitResult;
@@ -88,63 +100,6 @@
import javax.tools.StandardLocation;
import com.sun.source.util.JavacTask;
-import com.sun.tools.classfile.AccessFlags;
-import com.sun.tools.classfile.Annotation;
-import com.sun.tools.classfile.Annotation.Annotation_element_value;
-import com.sun.tools.classfile.Annotation.Array_element_value;
-import com.sun.tools.classfile.Annotation.Class_element_value;
-import com.sun.tools.classfile.Annotation.Enum_element_value;
-import com.sun.tools.classfile.Annotation.Primitive_element_value;
-import com.sun.tools.classfile.Annotation.element_value;
-import com.sun.tools.classfile.Annotation.element_value_pair;
-import com.sun.tools.classfile.AnnotationDefault_attribute;
-import com.sun.tools.classfile.Attribute;
-import com.sun.tools.classfile.Attributes;
-import com.sun.tools.classfile.ClassFile;
-import com.sun.tools.classfile.ClassWriter;
-import com.sun.tools.classfile.ConstantPool;
-import com.sun.tools.classfile.ConstantPool.CONSTANT_Class_info;
-import com.sun.tools.classfile.ConstantPool.CONSTANT_Double_info;
-import com.sun.tools.classfile.ConstantPool.CONSTANT_Float_info;
-import com.sun.tools.classfile.ConstantPool.CONSTANT_Integer_info;
-import com.sun.tools.classfile.ConstantPool.CONSTANT_Long_info;
-import com.sun.tools.classfile.ConstantPool.CONSTANT_Module_info;
-import com.sun.tools.classfile.ConstantPool.CONSTANT_Package_info;
-import com.sun.tools.classfile.ConstantPool.CONSTANT_String_info;
-import com.sun.tools.classfile.ConstantPool.CONSTANT_Utf8_info;
-import com.sun.tools.classfile.ConstantPool.CPInfo;
-import com.sun.tools.classfile.ConstantPool.InvalidIndex;
-import com.sun.tools.classfile.ConstantPoolException;
-import com.sun.tools.classfile.ConstantValue_attribute;
-import com.sun.tools.classfile.Deprecated_attribute;
-import com.sun.tools.classfile.Descriptor;
-import com.sun.tools.classfile.Exceptions_attribute;
-import com.sun.tools.classfile.Field;
-import com.sun.tools.classfile.InnerClasses_attribute;
-import com.sun.tools.classfile.InnerClasses_attribute.Info;
-import com.sun.tools.classfile.Method;
-import com.sun.tools.classfile.ModulePackages_attribute;
-import com.sun.tools.classfile.MethodParameters_attribute;
-import com.sun.tools.classfile.ModuleMainClass_attribute;
-import com.sun.tools.classfile.ModuleResolution_attribute;
-import com.sun.tools.classfile.ModuleTarget_attribute;
-import com.sun.tools.classfile.Module_attribute;
-import com.sun.tools.classfile.Module_attribute.ExportsEntry;
-import com.sun.tools.classfile.Module_attribute.OpensEntry;
-import com.sun.tools.classfile.Module_attribute.ProvidesEntry;
-import com.sun.tools.classfile.Module_attribute.RequiresEntry;
-import com.sun.tools.classfile.NestHost_attribute;
-import com.sun.tools.classfile.NestMembers_attribute;
-import com.sun.tools.classfile.PermittedSubclasses_attribute;
-import com.sun.tools.classfile.Record_attribute;
-import com.sun.tools.classfile.Record_attribute.ComponentInfo;
-import com.sun.tools.classfile.RuntimeAnnotations_attribute;
-import com.sun.tools.classfile.RuntimeInvisibleAnnotations_attribute;
-import com.sun.tools.classfile.RuntimeInvisibleParameterAnnotations_attribute;
-import com.sun.tools.classfile.RuntimeParameterAnnotations_attribute;
-import com.sun.tools.classfile.RuntimeVisibleAnnotations_attribute;
-import com.sun.tools.classfile.RuntimeVisibleParameterAnnotations_attribute;
-import com.sun.tools.classfile.Signature_attribute;
import com.sun.tools.javac.api.JavacTool;
import com.sun.tools.javac.jvm.Target;
import com.sun.tools.javac.util.Assert;
@@ -154,13 +109,15 @@
import java.util.Optional;
import java.util.function.Consumer;
+import static java.lang.classfile.ClassFile.ACC_PROTECTED;
+import static java.lang.classfile.ClassFile.ACC_PUBLIC;
+
/**
* A tool for processing the .sym.txt files.
*
* To add historical data for JDK N, N >= 11, do the following:
* * cd /src/jdk.compiler/share/data/symbols
- * * /bin/java --add-exports jdk.jdeps/com.sun.tools.classfile=ALL-UNNAMED \
- * --add-exports jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED \
+ * * /bin/java --add-exports jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED \
* --add-exports jdk.compiler/com.sun.tools.javac.jvm=ALL-UNNAMED \
* --add-exports jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED \
* --add-modules jdk.jdeps \
@@ -409,7 +366,7 @@ LoadDescriptions load(Path ctDescriptionWithExtraContent, Path ctDescriptionOpen
reader.moveNext();
break;
default:
- throw new IllegalStateException("Unknown key: " + reader.lineKey);
+ throw new IllegalArgumentException("Unknown key: " + reader.lineKey);
}
}
}
@@ -432,7 +389,7 @@ LoadDescriptions load(Path ctDescriptionWithExtraContent, Path ctDescriptionOpen
reader.moveNext();
break;
default:
- throw new IllegalStateException("Unknown key: " + reader.lineKey);
+ throw new IllegalArgumentException("Unknown key: " + reader.lineKey);
}
}
}
@@ -830,31 +787,12 @@ void writeModule(Map<String, Set<FileData>> directory2FileData,
ModuleHeaderDescription header,
char version,
Function<Character, String> version2ModuleVersion) throws IOException {
- List<CPInfo> constantPool = new ArrayList<>();
- constantPool.add(null);
- int currentClass = addClass(constantPool, "module-info");
- int superclass = 0;
- int[] interfaces = new int[0];
- AccessFlags flags = new AccessFlags(header.flags);
- Map<String, Attribute> attributesMap = new HashMap<>();
- String versionString = Character.toString(version);
- addAttributes(moduleDescription, header, constantPool, attributesMap,
- version2ModuleVersion.apply(version));
- Attributes attributes = new Attributes(attributesMap);
- CPInfo[] cpData = constantPool.toArray(new CPInfo[constantPool.size()]);
- ConstantPool cp = new ConstantPool(cpData);
- ClassFile classFile = new ClassFile(0xCAFEBABE,
- Target.DEFAULT.minorVersion,
- Target.DEFAULT.majorVersion,
- cp,
- flags,
- currentClass,
- superclass,
- interfaces,
- new Field[0],
- new Method[0],
- attributes);
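+ // Build module-info with the java.lang.classfile API; flags and attributes go directly onto the ClassBuilder.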
+ var classFile = ClassFile.of().build(ClassDesc.of("module-info"), clb -> {
+ clb.withFlags(header.flags);
+ addAttributes(moduleDescription, header, clb, version2ModuleVersion.apply(version));
+ });
+ String versionString = Character.toString(version);
doWrite(directory2FileData, versionString, moduleDescription.name, "module-info" + EXTENSION, classFile);
}
@@ -863,57 +801,26 @@ void writeClass(Map<String, Set<FileData>> directory2FileData,
ClassHeaderDescription header,
String module,
String version) throws IOException {
- List<CPInfo> constantPool = new ArrayList<>();
- constantPool.add(null);
- List<Method> methods = new ArrayList<>();
- for (MethodDescription methDesc : classDescription.methods) {
- if (disjoint(methDesc.versions, version))
- continue;
- Descriptor descriptor = new Descriptor(addString(constantPool, methDesc.descriptor));
- //TODO: LinkedHashMap to avoid param annotations vs. Signature problem in javac's ClassReader:
- Map<String, Attribute> attributesMap = new LinkedHashMap<>();
- addAttributes(methDesc, constantPool, attributesMap);
- Attributes attributes = new Attributes(attributesMap);
- AccessFlags flags = new AccessFlags(methDesc.flags);
- int nameString = addString(constantPool, methDesc.name);
- methods.add(new Method(flags, nameString, descriptor, attributes));
- }
- List<Field> fields = new ArrayList<>();
- for (FieldDescription fieldDesc : classDescription.fields) {
- if (disjoint(fieldDesc.versions, version))
- continue;
- Descriptor descriptor = new Descriptor(addString(constantPool, fieldDesc.descriptor));
- Map<String, Attribute> attributesMap = new HashMap<>();
- addAttributes(fieldDesc, constantPool, attributesMap);
- Attributes attributes = new Attributes(attributesMap);
- AccessFlags flags = new AccessFlags(fieldDesc.flags);
- int nameString = addString(constantPool, fieldDesc.name);
- fields.add(new Field(flags, nameString, descriptor, attributes));
- }
- int currentClass = addClass(constantPool, classDescription.name);
- int superclass = header.extendsAttr != null ? addClass(constantPool, header.extendsAttr) : 0;
- int[] interfaces = new int[header.implementsAttr.size()];
- int i = 0;
- for (String intf : header.implementsAttr) {
- interfaces[i++] = addClass(constantPool, intf);
- }
- AccessFlags flags = new AccessFlags(header.flags);
- Map<String, Attribute> attributesMap = new HashMap<>();
- addAttributes(header, constantPool, attributesMap);
- Attributes attributes = new Attributes(attributesMap);
- ConstantPool cp = new ConstantPool(constantPool.toArray(new CPInfo[constantPool.size()]));
- ClassFile classFile = new ClassFile(0xCAFEBABE,
- Target.DEFAULT.minorVersion,
- Target.DEFAULT.majorVersion,
- cp,
- flags,
- currentClass,
- superclass,
- interfaces,
- fields.toArray(new Field[0]),
- methods.toArray(new Method[0]),
- attributes);
-
+ var classFile = ClassFile.of().build(ClassDesc.ofInternalName(classDescription.name), clb -> {
+ if (header.extendsAttr != null)
+ clb.withSuperclass(ClassDesc.ofInternalName(header.extendsAttr));
+ clb.withInterfaceSymbols(header.implementsAttr.stream().map(ClassDesc::ofInternalName).collect(Collectors.toList()))
+ .withFlags(header.flags);
+ for (FieldDescription fieldDesc : classDescription.fields) {
+ if (disjoint(fieldDesc.versions, version))
+ continue;
+ clb.withField(fieldDesc.name, ClassDesc.ofDescriptor(fieldDesc.descriptor), fb -> {
+ addAttributes(fieldDesc, fb);
+ fb.withFlags(fieldDesc.flags);
+ });
+ }
+ for (MethodDescription methDesc : classDescription.methods) {
+ if (disjoint(methDesc.versions, version))
+ continue;
+ clb.withMethod(methDesc.name, MethodTypeDesc.ofDescriptor(methDesc.descriptor), methDesc.flags, mb -> addAttributes(methDesc, mb));
+ }
+ addAttributes(header, clb);
+ });
doWrite(directory2FileData, version, module, classDescription.name + EXTENSION, classFile);
}
@@ -921,19 +828,13 @@ private void doWrite(Map<String, Set<FileData>> directory2FileData,
String version,
String moduleName,
String fileName,
- ClassFile classFile) throws IOException {
+ byte[] classFile) throws IOException {
int lastSlash = fileName.lastIndexOf('/');
String pack = lastSlash != (-1) ? fileName.substring(0, lastSlash + 1) : "/";
String directory = version + "/" + moduleName + "/" + pack;
String fullFileName = version + "/" + moduleName + "/" + fileName;
- try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
- ClassWriter w = new ClassWriter();
-
- w.write(classFile, out);
-
- openDirectory(directory2FileData, directory)
- .add(new FileData(fullFileName, out.toByteArray()));
- }
+ openDirectory(directory2FileData, directory)
+ .add(new FileData(fullFileName, classFile));
}
private Set<FileData> openDirectory(Map<String, Set<FileData>> directory2FileData,
@@ -955,278 +856,147 @@ public FileData(String fileName, byte[] fileData) {
private void addAttributes(ModuleDescription md,
ModuleHeaderDescription header,
- List<CPInfo> cp,
- Map<String, Attribute> attributes,
+ ClassBuilder builder,
String moduleVersion) {
- addGenericAttributes(header, cp, attributes);
+ addGenericAttributes(header, builder);
if (header.moduleResolution != null) {
- int attrIdx = addString(cp, Attribute.ModuleResolution);
- final ModuleResolution_attribute resIdx =
- new ModuleResolution_attribute(attrIdx,
- header.moduleResolution);
- attributes.put(Attribute.ModuleResolution, resIdx);
+ builder.with(ModuleResolutionAttribute.of(header.moduleResolution));
}
if (header.moduleTarget != null) {
- int attrIdx = addString(cp, Attribute.ModuleTarget);
- int targetIdx = addString(cp, header.moduleTarget);
- attributes.put(Attribute.ModuleTarget,
- new ModuleTarget_attribute(attrIdx, targetIdx));
+ builder.with(ModuleTargetAttribute.of(header.moduleTarget));
}
if (header.moduleMainClass != null) {
- int attrIdx = addString(cp, Attribute.ModuleMainClass);
- int targetIdx = addClassName(cp, header.moduleMainClass);
- attributes.put(Attribute.ModuleMainClass,
- new ModuleMainClass_attribute(attrIdx, targetIdx));
- }
- int versionIdx = addString(cp, moduleVersion);
- int attrIdx = addString(cp, Attribute.Module);
- attributes.put(Attribute.Module,
- new Module_attribute(attrIdx,
- addModuleName(cp, md.name),
- 0,
- versionIdx,
- header.requires
- .stream()
- .map(r -> createRequiresEntry(cp, r))
- .collect(Collectors.toList())
- .toArray(new RequiresEntry[0]),
- header.exports
- .stream()
- .map(e -> createExportsEntry(cp, e))
- .collect(Collectors.toList())
- .toArray(new ExportsEntry[0]),
- header.opens
- .stream()
- .map(e -> createOpensEntry(cp, e))
- .collect(Collectors.toList())
- .toArray(new OpensEntry[0]),
- header.uses
- .stream()
- .mapToInt(u -> addClassName(cp, u))
- .toArray(),
- header.provides
- .stream()
- .map(p -> createProvidesEntry(cp, p))
- .collect(Collectors.toList())
- .toArray(new ProvidesEntry[0])));
- addInnerClassesAttribute(header, cp, attributes);
- }
-
- private static RequiresEntry createRequiresEntry(List<CPInfo> cp,
- RequiresDescription r) {
- final int idx = addModuleName(cp, r.moduleName);
- return new RequiresEntry(idx,
- r.flags,
- r.version != null
- ? addString(cp, r.version)
- : 0);
- }
-
- private static ExportsEntry createExportsEntry(List<CPInfo> cp,
- ExportsDescription export) {
- int[] to;
- if (export.isQualified()) {
- to = export.to.stream()
- .mapToInt(module -> addModuleName(cp, module))
- .toArray();
- } else {
- to = new int[0];
+ builder.with(ModuleMainClassAttribute.of(ClassDesc.ofInternalName(header.moduleMainClass)));
}
- return new ExportsEntry(addPackageName(cp, export.packageName()), 0, to);
- }
-
- private static OpensEntry createOpensEntry(List<CPInfo> cp, String e) {
- return new OpensEntry(addPackageName(cp, e), 0, new int[0]);
- }
-
- private static ProvidesEntry createProvidesEntry(List<CPInfo> cp,
- ModuleHeaderDescription.ProvidesDescription p) {
- final int idx = addClassName(cp, p.interfaceName);
- return new ProvidesEntry(idx, p.implNames
- .stream()
- .mapToInt(i -> addClassName(cp, i))
- .toArray());
+ builder.with(ModuleAttribute.of(ModuleDesc.of(md.name), mb -> {
+ mb.moduleVersion(moduleVersion);
+ for (var req : header.requires) {
+ mb.requires(ModuleDesc.of(req.moduleName), req.flags, req.version); // nullable version
+ }
+ for (var exp : header.exports) {
+ if (exp.isQualified()) {
+ mb.exports(PackageDesc.ofInternalName(exp.packageName()), 0, exp.to.stream().map(ModuleDesc::of).toArray(ModuleDesc[]::new));
+ } else {
+ mb.exports(PackageDesc.ofInternalName(exp.packageName()), 0);
+ }
+ }
+ for (var open : header.opens) {
+ mb.opens(PackageDesc.ofInternalName(open), 0);
+ }
+ for (var use : header.uses) {
+ mb.uses(ClassDesc.ofInternalName(use));
+ }
+ for (var provide : header.provides) {
+ mb.provides(ClassDesc.ofInternalName(provide.interfaceName),
+ provide.implNames.stream().map(ClassDesc::ofInternalName).toArray(ClassDesc[]::new));
+ }
+ }));
+ addInnerClassesAttribute(header, builder);
}
- private void addAttributes(ClassHeaderDescription header,
- List<CPInfo> constantPool, Map<String, Attribute> attributes) {
- addGenericAttributes(header, constantPool, attributes);
+ private void addAttributes(ClassHeaderDescription header, ClassBuilder builder) {
+ addGenericAttributes(header, builder);
if (header.nestHost != null) {
- int attributeString = addString(constantPool, Attribute.NestHost);
- int nestHost = addClass(constantPool, header.nestHost);
- attributes.put(Attribute.NestHost,
- new NestHost_attribute(attributeString, nestHost));
+ builder.with(NestHostAttribute.of(ClassDesc.ofInternalName(header.nestHost)));
}
if (header.nestMembers != null && !header.nestMembers.isEmpty()) {
- int attributeString = addString(constantPool, Attribute.NestMembers);
- int[] nestMembers = new int[header.nestMembers.size()];
- int i = 0;
- for (String intf : header.nestMembers) {
- nestMembers[i++] = addClass(constantPool, intf);
- }
- attributes.put(Attribute.NestMembers,
- new NestMembers_attribute(attributeString, nestMembers));
+ builder.with(NestMembersAttribute.ofSymbols(header.nestMembers.stream().map(ClassDesc::ofInternalName).collect(Collectors.toList())));
}
if (header.isRecord) {
- assert header.recordComponents != null;
- int attributeString = addString(constantPool, Attribute.Record);
- ComponentInfo[] recordComponents = new ComponentInfo[header.recordComponents.size()];
- int i = 0;
- for (RecordComponentDescription rcd : header.recordComponents) {
- int name = addString(constantPool, rcd.name);
- Descriptor desc = new Descriptor(addString(constantPool, rcd.descriptor));
- Map<String, Attribute> nestedAttrs = new HashMap<>();
- addGenericAttributes(rcd, constantPool, nestedAttrs);
- Attributes attrs = new Attributes(nestedAttrs);
- recordComponents[i++] = new ComponentInfo(name, desc, attrs);
- }
- attributes.put(Attribute.Record,
- new Record_attribute(attributeString, recordComponents));
+ builder.with(RecordAttribute.of(header.recordComponents.stream().map(desc -> {
+ List<Attribute<?>> attributes = new ArrayList<>();
+ addGenericAttributes(desc, attributes::add, builder.constantPool());
+ return RecordComponentInfo.of(desc.name, ClassDesc.ofDescriptor(desc.descriptor), attributes);
+ }).collect(Collectors.toList())));
}
if (header.isSealed) {
- int attributeString = addString(constantPool, Attribute.PermittedSubclasses);
- int[] subclasses = new int[header.permittedSubclasses.size()];
- int i = 0;
- for (String intf : header.permittedSubclasses) {
- subclasses[i++] = addClass(constantPool, intf);
- }
- attributes.put(Attribute.PermittedSubclasses,
- new PermittedSubclasses_attribute(attributeString, subclasses));
+ builder.with(PermittedSubclassesAttribute.ofSymbols(header.permittedSubclasses.stream().map(ClassDesc::ofInternalName).collect(Collectors.toList())));
}
- addInnerClassesAttribute(header, constantPool, attributes);
+ addInnerClassesAttribute(header, builder);
}
- private void addInnerClassesAttribute(HeaderDescription header,
- List<CPInfo> constantPool, Map<String, Attribute> attributes) {
+ private void addInnerClassesAttribute(HeaderDescription header, ClassBuilder builder) {
if (header.innerClasses != null && !header.innerClasses.isEmpty()) {
- Info[] innerClasses = new Info[header.innerClasses.size()];
- int i = 0;
- for (InnerClassInfo info : header.innerClasses) {
- innerClasses[i++] =
- new Info(info.innerClass == null ? 0 : addClass(constantPool, info.innerClass),
- info.outerClass == null ? 0 : addClass(constantPool, info.outerClass),
- info.innerClassName == null ? 0 : addString(constantPool, info.innerClassName),
- new AccessFlags(info.innerClassFlags));
- }
- int attributeString = addString(constantPool, Attribute.InnerClasses);
- attributes.put(Attribute.InnerClasses,
- new InnerClasses_attribute(attributeString, innerClasses));
+ builder.with(InnerClassesAttribute.of(header.innerClasses.stream()
+ .map(info -> java.lang.classfile.attribute.InnerClassInfo.of(
+ ClassDesc.ofInternalName(info.innerClass),
+ Optional.ofNullable(info.outerClass).map(ClassDesc::ofInternalName),
+ Optional.ofNullable(info.innerClassName),
+ info.innerClassFlags
+ )).collect(Collectors.toList())));
}
}
- private void addAttributes(MethodDescription desc, List<CPInfo> constantPool, Map<String, Attribute> attributes) {
- addGenericAttributes(desc, constantPool, attributes);
+ private void addAttributes(MethodDescription desc, MethodBuilder builder) {
+ addGenericAttributes(desc, builder);
if (desc.thrownTypes != null) {
- int[] exceptions = new int[desc.thrownTypes.size()];
- int i = 0;
- for (String exc : desc.thrownTypes) {
- exceptions[i++] = addClass(constantPool, exc);
- }
- int attributeString = addString(constantPool, Attribute.Exceptions);
- attributes.put(Attribute.Exceptions,
- new Exceptions_attribute(attributeString, exceptions));
+ builder.with(ExceptionsAttribute.ofSymbols(desc.thrownTypes.stream()
+ .map(ClassDesc::ofInternalName).collect(Collectors.toList())));
}
if (desc.annotationDefaultValue != null) {
- int attributeString = addString(constantPool, Attribute.AnnotationDefault);
- element_value attributeValue = createAttributeValue(constantPool,
- desc.annotationDefaultValue);
- attributes.put(Attribute.AnnotationDefault,
- new AnnotationDefault_attribute(attributeString, attributeValue));
+ builder.with(AnnotationDefaultAttribute.of(createAttributeValue(desc.annotationDefaultValue)));
}
if (desc.classParameterAnnotations != null && !desc.classParameterAnnotations.isEmpty()) {
- int attributeString =
- addString(constantPool, Attribute.RuntimeInvisibleParameterAnnotations);
- Annotation[][] annotations =
- createParameterAnnotations(constantPool, desc.classParameterAnnotations);
- attributes.put(Attribute.RuntimeInvisibleParameterAnnotations,
- new RuntimeInvisibleParameterAnnotations_attribute(attributeString,
- annotations));
+ builder.with(RuntimeInvisibleParameterAnnotationsAttribute.of(createParameterAnnotations(desc.classParameterAnnotations)));
}
if (desc.runtimeParameterAnnotations != null && !desc.runtimeParameterAnnotations.isEmpty()) {
- int attributeString =
- addString(constantPool, Attribute.RuntimeVisibleParameterAnnotations);
- Annotation[][] annotations =
- createParameterAnnotations(constantPool, desc.runtimeParameterAnnotations);
- attributes.put(Attribute.RuntimeVisibleParameterAnnotations,
- new RuntimeVisibleParameterAnnotations_attribute(attributeString,
- annotations));
+ builder.with(RuntimeVisibleParameterAnnotationsAttribute.of(createParameterAnnotations(desc.runtimeParameterAnnotations)));
}
if (desc.methodParameters != null && !desc.methodParameters.isEmpty()) {
- int attributeString =
- addString(constantPool, Attribute.MethodParameters);
- MethodParameters_attribute.Entry[] entries =
- desc.methodParameters
- .stream()
- .map(p -> new MethodParameters_attribute.Entry(p.name == null || p.name.isEmpty() ? 0
- : addString(constantPool, p.name),
- p.flags))
- .toArray(s -> new MethodParameters_attribute.Entry[s]);
- attributes.put(Attribute.MethodParameters,
- new MethodParameters_attribute(attributeString, entries));
+ builder.with(MethodParametersAttribute.of(desc.methodParameters.stream()
+ .map(mp -> MethodParameterInfo.ofParameter(Optional.ofNullable(mp.name), mp.flags)).collect(Collectors.toList())));
}
}
- private void addAttributes(FieldDescription desc, List<CPInfo> constantPool, Map<String, Attribute> attributes) {
- addGenericAttributes(desc, constantPool, attributes);
+ private void addAttributes(FieldDescription desc, FieldBuilder builder) {
+ addGenericAttributes(desc, builder);
if (desc.constantValue != null) {
- Pair<Integer, Character> constantPoolEntry =
- addConstant(constantPool, desc.constantValue, false);
- Assert.checkNonNull(constantPoolEntry);
- int constantValueString = addString(constantPool, Attribute.ConstantValue);
- attributes.put(Attribute.ConstantValue,
- new ConstantValue_attribute(constantValueString, constantPoolEntry.fst));
+ var cp = builder.constantPool();
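+ // Map the boxed constant to a matching constant pool entry; booleans and chars are stored as integer entries.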
+ ConstantValueEntry entry = switch (desc.constantValue) {
+ case Boolean v -> cp.intEntry(v ? 1 : 0);
+ case Character v -> cp.intEntry(v);
+ case Integer v -> cp.intEntry(v);
+ case Long v -> cp.longEntry(v);
+ case Float v -> cp.floatEntry(v);
+ case Double v -> cp.doubleEntry(v);
+ case String v -> cp.stringEntry(v);
+ default -> throw new IllegalArgumentException(desc.constantValue.getClass().toString());
+ };
+ builder.with(ConstantValueAttribute.of(entry));
}
}
- private void addGenericAttributes(FeatureDescription desc, List<CPInfo> constantPool, Map<String, Attribute> attributes) {
+ @SuppressWarnings("unchecked")
+ private void addGenericAttributes(FeatureDescription desc, ClassFileBuilder<?, ?> builder) {
+ addGenericAttributes(desc, (Consumer<? super Attribute<?>>) builder, builder.constantPool());
+ }
+
+ private void addGenericAttributes(FeatureDescription desc, Consumer<? super Attribute<?>> sink, ConstantPoolBuilder cpb) {
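+ // The sink is either a ClassFileBuilder (which consumes attributes directly) or a plain collector such as attributes::add above.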
+ @SuppressWarnings("unchecked")
+ var builder = (Consumer<Attribute<?>>) sink;
if (desc.deprecated) {
- int attributeString = addString(constantPool, Attribute.Deprecated);
- attributes.put(Attribute.Deprecated,
- new Deprecated_attribute(attributeString));
+ builder.accept(DeprecatedAttribute.of());
}
if (desc.signature != null) {
- int attributeString = addString(constantPool, Attribute.Signature);
- int signatureString = addString(constantPool, desc.signature);
- attributes.put(Attribute.Signature,
- new Signature_attribute(attributeString, signatureString));
+ builder.accept(SignatureAttribute.of(cpb.utf8Entry(desc.signature)));
}
if (desc.classAnnotations != null && !desc.classAnnotations.isEmpty()) {
- int attributeString = addString(constantPool, Attribute.RuntimeInvisibleAnnotations);
- Annotation[] annotations = createAnnotations(constantPool, desc.classAnnotations);
- attributes.put(Attribute.RuntimeInvisibleAnnotations,
- new RuntimeInvisibleAnnotations_attribute(attributeString, annotations));
+ builder.accept(RuntimeInvisibleAnnotationsAttribute.of(createAnnotations(desc.classAnnotations)));
}
if (desc.runtimeAnnotations != null && !desc.runtimeAnnotations.isEmpty()) {
- int attributeString = addString(constantPool, Attribute.RuntimeVisibleAnnotations);
- Annotation[] annotations = createAnnotations(constantPool, desc.runtimeAnnotations);
- attributes.put(Attribute.RuntimeVisibleAnnotations,
- new RuntimeVisibleAnnotations_attribute(attributeString, annotations));
+ builder.accept(RuntimeVisibleAnnotationsAttribute.of(createAnnotations(desc.runtimeAnnotations)));
}
}
- private Annotation[] createAnnotations(List<CPInfo> constantPool, List<AnnotationDescription> desc) {
- Annotation[] result = new Annotation[desc.size()];
- int i = 0;
-
- for (AnnotationDescription ad : desc) {
- result[i++] = createAnnotation(constantPool, ad);
- }
-
- return result;
+ private List<Annotation> createAnnotations(List<AnnotationDescription> desc) {
+ return desc.stream().map(this::createAnnotation).collect(Collectors.toList());
}
- private Annotation[][] createParameterAnnotations(List<CPInfo> constantPool, List<List<AnnotationDescription>> desc) {
- Annotation[][] result = new Annotation[desc.size()][];
- int i = 0;
-
- for (List paramAnnos : desc) {
- result[i++] = createAnnotations(constantPool, paramAnnos);
- }
-
- return result;
+ private List<List<Annotation>> createParameterAnnotations(List<List<AnnotationDescription>> desc) {
+ return desc.stream().map(this::createAnnotations).collect(Collectors.toList());
}
- private Annotation createAnnotation(List<CPInfo> constantPool, AnnotationDescription desc) {
+ private Annotation createAnnotation(AnnotationDescription desc) {
String annotationType = desc.annotationType;
Map<String, Object> values = desc.values;
@@ -1257,184 +1027,33 @@ private Annotation createAnnotation(List<CPInfo> constantPool, AnnotationDescrip
annotationType = RESTRICTED_ANNOTATION_INTERNAL;
}
- return new Annotation(null,
- addString(constantPool, annotationType),
- createElementPairs(constantPool, values));
- }
-
- private element_value_pair[] createElementPairs(List<CPInfo> constantPool, Map<String, Object> annotationAttributes) {
- element_value_pair[] pairs = new element_value_pair[annotationAttributes.size()];
- int i = 0;
-
- for (Entry<String, Object> e : annotationAttributes.entrySet()) {
- int elementNameString = addString(constantPool, e.getKey());
- element_value value = createAttributeValue(constantPool, e.getValue());
- pairs[i++] = new element_value_pair(elementNameString, value);
- }
-
- return pairs;
+ return Annotation.of(ClassDesc.ofDescriptor(annotationType),
+ createElementPairs(values));
}
- private element_value createAttributeValue(List<CPInfo> constantPool, Object value) {
- Pair<Integer, Character> constantPoolEntry = addConstant(constantPool, value, true);
- if (constantPoolEntry != null) {
- return new Primitive_element_value(constantPoolEntry.fst, constantPoolEntry.snd);
- } else if (value instanceof EnumConstant) {
- EnumConstant ec = (EnumConstant) value;
- return new Enum_element_value(addString(constantPool, ec.type),
- addString(constantPool, ec.constant),
- 'e');
- } else if (value instanceof ClassConstant) {
- ClassConstant cc = (ClassConstant) value;
- return new Class_element_value(addString(constantPool, cc.type), 'c');
- } else if (value instanceof AnnotationDescription) {
- Annotation annotation = createAnnotation(constantPool, ((AnnotationDescription) value));
- return new Annotation_element_value(annotation, '@');
- } else if (value instanceof Collection) {
- @SuppressWarnings("unchecked")
- Collection<Object> array = (Collection<Object>) value;
- element_value[] values = new element_value[array.size()];
- int i = 0;
-
- for (Object elem : array) {
- values[i++] = createAttributeValue(constantPool, elem);
- }
-
- return new Array_element_value(values, '[');
- }
- throw new IllegalStateException(value.getClass().getName());
+ private List<AnnotationElement> createElementPairs(Map<String, Object> annotationAttributes) {
+ return annotationAttributes.entrySet().stream()
+ .map(e -> AnnotationElement.of(e.getKey(), createAttributeValue(e.getValue())))
+ .collect(Collectors.toList());
}
- private static Pair<Integer, Character> addConstant(List<CPInfo> constantPool, Object value, boolean annotation) {
- if (value instanceof Boolean) {
- return Pair.of(addToCP(constantPool, new CONSTANT_Integer_info(((Boolean) value) ? 1 : 0)), 'Z');
- } else if (value instanceof Byte) {
- return Pair.of(addToCP(constantPool, new CONSTANT_Integer_info((byte) value)), 'B');
- } else if (value instanceof Character) {
- return Pair.of(addToCP(constantPool, new CONSTANT_Integer_info((char) value)), 'C');
- } else if (value instanceof Short) {
- return Pair.of(addToCP(constantPool, new CONSTANT_Integer_info((short) value)), 'S');
- } else if (value instanceof Integer) {
- return Pair.of(addToCP(constantPool, new CONSTANT_Integer_info((int) value)), 'I');
- } else if (value instanceof Long) {
- return Pair.of(addToCP(constantPool, new CONSTANT_Long_info((long) value)), 'J');
- } else if (value instanceof Float) {
- return Pair.of(addToCP(constantPool, new CONSTANT_Float_info((float) value)), 'F');
- } else if (value instanceof Double) {
- return Pair.of(addToCP(constantPool, new CONSTANT_Double_info((double) value)), 'D');
- } else if (value instanceof String) {
- int stringIndex = addString(constantPool, (String) value);
- if (annotation) {
- return Pair.of(stringIndex, 's');
- } else {
- return Pair.of(addToCP(constantPool, new CONSTANT_String_info(null, stringIndex)), 's');
- }
- }
-
- return null;
- }
-
- private static int addString(List<CPInfo> constantPool, String string) {
- Assert.checkNonNull(string);
-
- int i = 0;
- for (CPInfo info : constantPool) {
- if (info instanceof CONSTANT_Utf8_info) {
- if (((CONSTANT_Utf8_info) info).value.equals(string)) {
- return i;
- }
- }
- i++;
- }
-
- return addToCP(constantPool, new CONSTANT_Utf8_info(string));
- }
-
- private static int addInt(List<CPInfo> constantPool, int value) {
- int i = 0;
- for (CPInfo info : constantPool) {
- if (info instanceof CONSTANT_Integer_info) {
- if (((CONSTANT_Integer_info) info).value == value) {
- return i;
- }
- }
- i++;
- }
-
- return addToCP(constantPool, new CONSTANT_Integer_info(value));
- }
-
- private static int addModuleName(List<CPInfo> constantPool, String moduleName) {
- int nameIdx = addString(constantPool, moduleName);
- int i = 0;
- for (CPInfo info : constantPool) {
- if (info instanceof CONSTANT_Module_info) {
- if (((CONSTANT_Module_info) info).name_index == nameIdx) {
- return i;
- }
- }
- i++;
- }
-
- return addToCP(constantPool, new CONSTANT_Module_info(null, nameIdx));
- }
-
- private static int addPackageName(List<CPInfo> constantPool, String packageName) {
- int nameIdx = addString(constantPool, packageName);
- int i = 0;
- for (CPInfo info : constantPool) {
- if (info instanceof CONSTANT_Package_info) {
- if (((CONSTANT_Package_info) info).name_index == nameIdx) {
- return i;
- }
- }
- i++;
- }
-
- return addToCP(constantPool, new CONSTANT_Package_info(null, nameIdx));
- }
-
- private static int addClassName(List<CPInfo> constantPool, String className) {
- int nameIdx = addString(constantPool, className);
- int i = 0;
- for (CPInfo info : constantPool) {
- if (info instanceof CONSTANT_Class_info) {
- if (((CONSTANT_Class_info) info).name_index == nameIdx) {
- return i;
- }
- }
- i++;
- }
-
- return addToCP(constantPool, new CONSTANT_Class_info(null, nameIdx));
- }
-
- private static int addToCP(List<CPInfo> constantPool, CPInfo entry) {
- int result = constantPool.size();
-
- constantPool.add(entry);
-
- if (entry.size() > 1) {
- constantPool.add(null);
- }
-
- return result;
- }
-
- private static int addClass(List<CPInfo> constantPool, String className) {
- int classNameIndex = addString(constantPool, className);
-
- int i = 0;
- for (CPInfo info : constantPool) {
- if (info instanceof CONSTANT_Class_info) {
- if (((CONSTANT_Class_info) info).name_index == classNameIndex) {
- return i;
- }
- }
- i++;
- }
-
- return addToCP(constantPool, new CONSTANT_Class_info(null, classNameIndex));
+ private AnnotationValue createAttributeValue(Object value) {
+ return switch (value) {
+ case Boolean v -> AnnotationValue.ofBoolean(v);
+ case Byte v -> AnnotationValue.ofByte(v);
+ case Character v -> AnnotationValue.ofChar(v);
+ case Short v -> AnnotationValue.ofShort(v);
+ case Integer v -> AnnotationValue.ofInt(v);
+ case Long v -> AnnotationValue.ofLong(v);
+ case Float v -> AnnotationValue.ofFloat(v);
+ case Double v -> AnnotationValue.ofDouble(v);
+ case String v -> AnnotationValue.ofString(v);
+ case EnumConstant v -> AnnotationValue.ofEnum(ClassDesc.ofDescriptor(v.type), v.constant);
+ case ClassConstant v -> AnnotationValue.ofClass(ClassDesc.ofDescriptor(v.type));
+ case AnnotationDescription v -> AnnotationValue.ofAnnotation(createAnnotation(v));
+ case Collection<?> v -> AnnotationValue.ofArray(v.stream().map(this::createAttributeValue).collect(Collectors.toList()));
+ default -> throw new IllegalArgumentException(value.getClass().getName());
+ };
}
//
//
@@ -1509,7 +1128,7 @@ private Iterable<byte[]> loadClassData(String path) {
classFileData.add(data.toByteArray());
}
} catch (IOException ex) {
- throw new IllegalStateException(ex);
+ throw new IllegalArgumentException(ex);
}
return classFileData;
@@ -1525,12 +1144,8 @@ private void loadVersionClasses(ClassList classes,
new HashMap<>();
for (byte[] classFileData : classData) {
- try (InputStream in = new ByteArrayInputStream(classFileData)) {
- inspectModuleInfoClassFile(in,
- currentVersionModules, version);
- } catch (IOException | ConstantPoolException ex) {
- throw new IllegalStateException(ex);
- }
+ inspectModuleInfoClassFile(classFileData,
+ currentVersionModules, version);
}
ExcludeIncludeList currentEIList;
@@ -1561,28 +1176,25 @@ private void loadVersionClasses(ClassList classes,
try (InputStream in = new ByteArrayInputStream(classFileData)) {
inspectClassFile(in, currentVersionClasses,
currentEIList, version,
- cf -> {
- PermittedSubclasses_attribute permitted = (PermittedSubclasses_attribute) cf.getAttribute(Attribute.PermittedSubclasses);
+ cm -> {
+ var permitted = cm.findAttribute(Attributes.permittedSubclasses()).orElse(null);
if (permitted != null) {
- try {
- String currentPack = cf.getName().substring(0, cf.getName().lastIndexOf('/'));
+ var name = cm.thisClass().asInternalName();
+ String currentPack = name.substring(0, name.lastIndexOf('/'));
- for (int i = 0; i < permitted.subtypes.length; i++) {
- String permittedClassName = cf.constant_pool.getClassInfo(permitted.subtypes[i]).getName();
- if (!currentEIList.accepts(permittedClassName, false)) {
- String permittedPack = permittedClassName.substring(0, permittedClassName.lastIndexOf('/'));
+ for (var sub : permitted.permittedSubclasses()) {
+ String permittedClassName = sub.asInternalName();
+ if (!currentEIList.accepts(permittedClassName, false)) {
+ String permittedPack = permittedClassName.substring(0, permittedClassName.lastIndexOf('/'));
- extraModulesPackagesToDerive.computeIfAbsent(permittedPack, x -> new HashSet<>())
- .add(currentPack);
- }
+ extraModulesPackagesToDerive.computeIfAbsent(permittedPack, x -> new HashSet<>())
+ .add(currentPack);
}
- } catch (ConstantPoolException ex) {
- throw new IllegalStateException(ex);
}
}
});
- } catch (IOException | ConstantPoolException ex) {
- throw new IllegalStateException(ex);
+ } catch (IOException ex) {
+ throw new IllegalArgumentException(ex);
}
}
@@ -1648,12 +1260,8 @@ private void loadVersionClassesFromDirectory(ClassList classes,
Path moduleInfo = p.resolve("module-info.class");
if (Files.isReadable(moduleInfo)) {
- ModuleDescription md;
-
- try (InputStream in = Files.newInputStream(moduleInfo)) {
- md = inspectModuleInfoClassFile(in,
+ ModuleDescription md = inspectModuleInfoClassFile(Files.readAllBytes(moduleInfo),
currentVersionModules, version);
- }
if (md == null) {
continue;
}
@@ -1715,8 +1323,8 @@ private void loadVersionClassesFromDirectory(ClassList classes,
}
}
}
- } catch (IOException | ConstantPoolException ex) {
- throw new IllegalStateException(ex);
+ } catch (IOException ex) {
+ throw new IllegalArgumentException(ex);
}
finishClassLoading(classes, modules, currentVersionModules, currentVersionClasses, currentEIList, version, baseline);
@@ -1724,7 +1332,7 @@ private void loadVersionClassesFromDirectory(ClassList classes,
private void loadFromDirectoryHandleClassFile(Path path, ClassList currentVersionClasses,
ExcludeIncludeList currentEIList, String version,
- List todo) throws IOException, ConstantPoolException {
+ List todo) throws IOException {
try (InputStream in = Files.newInputStream(path)) {
inspectClassFile(in, currentVersionClasses,
currentEIList, version,
@@ -2244,45 +1852,41 @@ private JavaFileManager setupJavac(String... options) {
public static String PROFILE_ANNOTATION = "Ljdk/Profile+Annotation;";
public static boolean ALLOW_NON_EXISTING_CLASSES = false;
- private void inspectClassFile(InputStream in, ClassList classes, ExcludeIncludeList excludesIncludes, String version) throws IOException, ConstantPoolException {
+ private void inspectClassFile(InputStream in, ClassList classes, ExcludeIncludeList excludesIncludes, String version) throws IOException {
inspectClassFile(in, classes, excludesIncludes, version, cf -> {});
}
private void inspectClassFile(InputStream in, ClassList classes, ExcludeIncludeList excludesIncludes, String version,
- Consumer<ClassFile> extraTask) throws IOException, ConstantPoolException {
- ClassFile cf = ClassFile.read(in);
+ Consumer<ClassModel> extraTask) throws IOException {
+ ClassModel cm = ClassFile.of().parse(in.readAllBytes());
- if (cf.access_flags.is(AccessFlags.ACC_MODULE)) {
+ if (cm.isModuleInfo()) {
return ;
}
- if (!excludesIncludes.accepts(cf.getName(), true)) {
+ if (!excludesIncludes.accepts(cm.thisClass().asInternalName(), true)) {
return ;
}
- extraTask.accept(cf);
+ extraTask.accept(cm);
ClassHeaderDescription headerDesc = new ClassHeaderDescription();
- headerDesc.flags = cf.access_flags.flags;
+ headerDesc.flags = cm.flags().flagsMask();
- if (cf.super_class != 0) {
- headerDesc.extendsAttr = cf.getSuperclassName();
- }
- List<String> interfaces = new ArrayList<>();
- for (int i = 0; i < cf.interfaces.length; i++) {
- interfaces.add(cf.getInterfaceName(i));
+ if (cm.superclass().isPresent()) {
+ headerDesc.extendsAttr = cm.superclass().get().asInternalName();
}
- headerDesc.implementsAttr = interfaces;
- for (Attribute attr : cf.attributes) {
- if (!readAttribute(cf, headerDesc, attr))
+ headerDesc.implementsAttr = cm.interfaces().stream().map(ClassEntry::asInternalName).collect(Collectors.toList());
+ for (var attr : cm.attributes()) {
+ if (!readAttribute(headerDesc, attr))
return ;
}
ClassDescription clazzDesc = null;
for (ClassDescription cd : classes) {
- if (cd.name.equals(cf.getName())) {
+ if (cd.name.equals(cm.thisClass().asInternalName())) {
clazzDesc = cd;
break;
}
@@ -2290,54 +1894,54 @@ private void inspectClassFile(InputStream in, ClassList classes, ExcludeIncludeL
if (clazzDesc == null) {
clazzDesc = new ClassDescription();
- clazzDesc.name = cf.getName();
+ clazzDesc.name = cm.thisClass().asInternalName();
classes.add(clazzDesc);
}
addClassHeader(clazzDesc, headerDesc, version, null);
- for (Method m : cf.methods) {
- if (!include(m.access_flags.flags))
+ for (var m : cm.methods()) {
+ if (!include(m.flags().flagsMask()))
continue;
MethodDescription methDesc = new MethodDescription();
- methDesc.flags = m.access_flags.flags;
- methDesc.name = m.getName(cf.constant_pool);
- methDesc.descriptor = m.descriptor.getValue(cf.constant_pool);
- for (Attribute attr : m.attributes) {
- readAttribute(cf, methDesc, attr);
+ methDesc.flags = m.flags().flagsMask();
+ methDesc.name = m.methodName().stringValue();
+ methDesc.descriptor = m.methodType().stringValue();
+ for (var attr : m.attributes()) {
+ readAttribute(methDesc, attr);
}
addMethod(clazzDesc, methDesc, version, null);
}
- for (Field f : cf.fields) {
- if (!include(f.access_flags.flags))
+ for (var f : cm.fields()) {
+ if (!include(f.flags().flagsMask()))
continue;
FieldDescription fieldDesc = new FieldDescription();
- fieldDesc.flags = f.access_flags.flags;
- fieldDesc.name = f.getName(cf.constant_pool);
- fieldDesc.descriptor = f.descriptor.getValue(cf.constant_pool);
- for (Attribute attr : f.attributes) {
- readAttribute(cf, fieldDesc, attr);
+ fieldDesc.flags = f.flags().flagsMask();
+ fieldDesc.name = f.fieldName().stringValue();
+ fieldDesc.descriptor = f.fieldType().stringValue();
+ for (var attr : f.attributes()) {
+ readAttribute(fieldDesc, attr);
}
addField(clazzDesc, fieldDesc, version, null);
}
}
- private ModuleDescription inspectModuleInfoClassFile(InputStream in,
+ private ModuleDescription inspectModuleInfoClassFile(byte[] data,
Map<String, ModuleDescription> modules,
- String version) throws IOException, ConstantPoolException {
- ClassFile cf = ClassFile.read(in);
+ String version) {
+ ClassModel cm = ClassFile.of().parse(data);
- if (!cf.access_flags.is(AccessFlags.ACC_MODULE)) {
+ if (!cm.flags().has(AccessFlag.MODULE)) {
return null;
}
ModuleHeaderDescription headerDesc = new ModuleHeaderDescription();
headerDesc.versions = version;
- headerDesc.flags = cf.access_flags.flags;
+ headerDesc.flags = cm.flags().flagsMask();
- for (Attribute attr : cf.attributes) {
- if (!readAttribute(cf, headerDesc, attr))
+ for (var attr : cm.attributes()) {
+ if (!readAttribute(headerDesc, attr))
return null;
}
@@ -2361,56 +1965,46 @@ private Set<String> enhancedIncludesListBasedOnClassHeaders(ClassList classes,
Set<String> additionalIncludes = new HashSet<>();
for (byte[] classFileData : classData) {
- try (InputStream in = new ByteArrayInputStream(classFileData)) {
- ClassFile cf = ClassFile.read(in);
-
- additionalIncludes.addAll(otherRelevantTypesWithOwners(cf));
- } catch (IOException | ConstantPoolException ex) {
- throw new IllegalStateException(ex);
- }
+ additionalIncludes.addAll(otherRelevantTypesWithOwners(ClassFile.of().parse(classFileData)));
}
return additionalIncludes;
}
- private Set<String> otherRelevantTypesWithOwners(ClassFile cf) {
+ private Set<String> otherRelevantTypesWithOwners(ClassModel cm) {
Set<String> supertypes = new HashSet<>();
- try {
- if (cf.access_flags.is(AccessFlags.ACC_MODULE)) {
- return supertypes;
- }
+ if (cm.flags().has(AccessFlag.MODULE)) {
+ return supertypes;
+ }
- Set<String> additionalClasses = new HashSet<>();
+ Set<String> additionalClasses = new HashSet<>();
- if (cf.super_class != 0) {
- additionalClasses.add(cf.getSuperclassName());
- }
- for (int i = 0; i < cf.interfaces.length; i++) {
- additionalClasses.add(cf.getInterfaceName(i));
- }
- PermittedSubclasses_attribute permitted = (PermittedSubclasses_attribute) cf.getAttribute(Attribute.PermittedSubclasses);
- if (permitted != null) {
- for (int i = 0; i < permitted.subtypes.length; i++) {
- additionalClasses.add(cf.constant_pool.getClassInfo(permitted.subtypes[i]).getName());
- }
+ if (cm.superclass().isPresent()) {
+ additionalClasses.add(cm.superclass().get().asInternalName());
+ }
+ for (var iface : cm.interfaces()) {
+ additionalClasses.add(iface.asInternalName());
+ }
+ var permitted = cm.findAttribute(Attributes.permittedSubclasses()).orElse(null);
+ if (permitted != null) {
+ for (var sub : permitted.permittedSubclasses()) {
+ additionalClasses.add(sub.asInternalName());
}
+ }
- for (String additional : additionalClasses) {
- int dollar;
+ for (String additional : additionalClasses) {
+ int dollar;
- supertypes.add(additional);
+ supertypes.add(additional);
- while ((dollar = additional.lastIndexOf('$')) != (-1)) {
- additional = additional.substring(0, dollar);
- supertypes.add(additional);
- }
+ while ((dollar = additional.lastIndexOf('$')) != (-1)) {
+ additional = additional.substring(0, dollar);
+ supertypes.add(additional);
}
-
- return supertypes;
- } catch (ConstantPoolException ex) {
- throw new IllegalStateException(ex);
}
+
+ return supertypes;
}
private void addModuleHeader(ModuleDescription moduleDesc,
@@ -2435,7 +2029,7 @@ private void addModuleHeader(ModuleDescription moduleDesc,
}
private boolean include(int accessFlags) {
- return (accessFlags & (AccessFlags.ACC_PUBLIC | AccessFlags.ACC_PROTECTED)) != 0;
+ return (accessFlags & (ACC_PUBLIC | ACC_PROTECTED)) != 0;
}
private void addClassHeader(ClassDescription clazzDesc, ClassHeaderDescription headerDesc, String version, String baseline) {
@@ -2531,367 +2125,135 @@ private void addField(ClassDescription clazzDesc, FieldDescription fieldDesc, St
}
}
- private boolean readAttribute(ClassFile cf, FeatureDescription feature, Attribute attr) throws ConstantPoolException {
- String attrName = attr.getName(cf.constant_pool);
- switch (attrName) {
- case Attribute.AnnotationDefault:
- assert feature instanceof MethodDescription;
- element_value defaultValue = ((AnnotationDefault_attribute) attr).default_value;
- ((MethodDescription) feature).annotationDefaultValue =
- convertElementValue(cf.constant_pool, defaultValue);
- break;
- case "Deprecated":
- feature.deprecated = true;
- break;
- case "Exceptions":
- assert feature instanceof MethodDescription;
- List<String> thrownTypes = new ArrayList<>();
- Exceptions_attribute exceptionAttr = (Exceptions_attribute) attr;
- for (int i = 0; i < exceptionAttr.exception_index_table.length; i++) {
- thrownTypes.add(exceptionAttr.getException(i, cf.constant_pool));
- }
- ((MethodDescription) feature).thrownTypes = thrownTypes;
- break;
- case Attribute.InnerClasses:
+ private boolean readAttribute(FeatureDescription feature, Attribute<?> attr) {
+ switch (attr) {
+ case AnnotationDefaultAttribute a ->
+ ((MethodDescription) feature).annotationDefaultValue = convertElementValue(a.defaultValue());
+ case DeprecatedAttribute _ -> feature.deprecated = true;
+ case ExceptionsAttribute a -> ((MethodDescription) feature).thrownTypes = a.exceptions().stream().map(ClassEntry::asInternalName).collect(Collectors.toList());
+ case InnerClassesAttribute a -> {
if (feature instanceof ModuleHeaderDescription)
break; //XXX
- assert feature instanceof ClassHeaderDescription;
- List<InnerClassInfo> innerClasses = new ArrayList<>();
- InnerClasses_attribute innerClassesAttr = (InnerClasses_attribute) attr;
- for (int i = 0; i < innerClassesAttr.number_of_classes; i++) {
- CONSTANT_Class_info outerClassInfo =
- innerClassesAttr.classes[i].getOuterClassInfo(cf.constant_pool);
- InnerClassInfo info = new InnerClassInfo();
- CONSTANT_Class_info innerClassInfo =
- innerClassesAttr.classes[i].getInnerClassInfo(cf.constant_pool);
- info.innerClass = innerClassInfo != null ? innerClassInfo.getName() : null;
- info.outerClass = outerClassInfo != null ? outerClassInfo.getName() : null;
- info.innerClassName = innerClassesAttr.classes[i].getInnerName(cf.constant_pool);
- info.innerClassFlags = innerClassesAttr.classes[i].inner_class_access_flags.flags;
- innerClasses.add(info);
- }
- ((ClassHeaderDescription) feature).innerClasses = innerClasses;
- break;
- case "RuntimeInvisibleAnnotations":
- feature.classAnnotations = annotations2Description(cf.constant_pool, attr);
- break;
- case "RuntimeVisibleAnnotations":
- feature.runtimeAnnotations = annotations2Description(cf.constant_pool, attr);
- break;
- case "Signature":
- feature.signature = ((Signature_attribute) attr).getSignature(cf.constant_pool);
- break;
- case "ConstantValue":
- assert feature instanceof FieldDescription;
- Object value = convertConstantValue(cf.constant_pool.get(((ConstantValue_attribute) attr).constantvalue_index), ((FieldDescription) feature).descriptor);
- if (((FieldDescription) feature).descriptor.equals("C")) {
- value = (char) (int) value;
- }
- ((FieldDescription) feature).constantValue = value;
- break;
- case "SourceFile":
- //ignore, not needed
- break;
- case "BootstrapMethods":
- //ignore, not needed
- break;
- case "Code":
- //ignore, not needed
- break;
- case "EnclosingMethod":
+ ((ClassHeaderDescription) feature).innerClasses = a.classes().stream().map(cfi -> {
+ var info = new InnerClassInfo();
+ info.innerClass = cfi.innerClass().asInternalName();
+ info.outerClass = cfi.outerClass().map(ClassEntry::asInternalName).orElse(null);
+ info.innerClassName = cfi.innerName().map(Utf8Entry::stringValue).orElse(null);
+ info.innerClassFlags = cfi.flagsMask();
+ return info;
+ }).collect(Collectors.toList());
+ }
+ case RuntimeInvisibleAnnotationsAttribute a -> feature.classAnnotations = annotations2Description(a.annotations());
+ case RuntimeVisibleAnnotationsAttribute a -> feature.runtimeAnnotations = annotations2Description(a.annotations());
+ case SignatureAttribute a -> feature.signature = a.signature().stringValue();
+ case ConstantValueAttribute a -> {
+ var f = (FieldDescription) feature;
+ f.constantValue = convertConstantValue(a.constant(), f.descriptor);
+ }
+ case SourceFileAttribute _, BootstrapMethodsAttribute _, CodeAttribute _, SyntheticAttribute _ -> {}
+ case EnclosingMethodAttribute _ -> {
return false;
- case "Synthetic":
- break;
- case "RuntimeVisibleParameterAnnotations":
- assert feature instanceof MethodDescription;
- ((MethodDescription) feature).runtimeParameterAnnotations =
- parameterAnnotations2Description(cf.constant_pool, attr);
- break;
- case "RuntimeInvisibleParameterAnnotations":
- assert feature instanceof MethodDescription;
- ((MethodDescription) feature).classParameterAnnotations =
- parameterAnnotations2Description(cf.constant_pool, attr);
- break;
- case Attribute.Module: {
- assert feature instanceof ModuleHeaderDescription;
- ModuleHeaderDescription header =
- (ModuleHeaderDescription) feature;
- Module_attribute mod = (Module_attribute) attr;
-
- header.name = cf.constant_pool
- .getModuleInfo(mod.module_name)
- .getName();
-
- header.exports =
- Arrays.stream(mod.exports)
- .map(ee -> ExportsDescription.create(cf, ee))
- .collect(Collectors.toList());
+ }
+ case RuntimeVisibleParameterAnnotationsAttribute a -> ((MethodDescription) feature).runtimeParameterAnnotations = parameterAnnotations2Description(a.parameterAnnotations());
+ case RuntimeInvisibleParameterAnnotationsAttribute a -> ((MethodDescription) feature).classParameterAnnotations = parameterAnnotations2Description(a.parameterAnnotations());
+ case ModuleAttribute a -> {
+ ModuleHeaderDescription header = (ModuleHeaderDescription) feature;
+ header.name = a.moduleName().name().stringValue();
+ header.exports = a.exports().stream().map(ExportsDescription::create).collect(Collectors.toList());
if (header.extraModulePackages != null) {
header.exports.forEach(ed -> header.extraModulePackages.remove(ed.packageName()));
}
- header.requires =
- Arrays.stream(mod.requires)
- .map(r -> RequiresDescription.create(cf, r))
- .collect(Collectors.toList());
- header.uses = Arrays.stream(mod.uses_index)
- .mapToObj(use -> getClassName(cf, use))
- .collect(Collectors.toList());
- header.provides =
- Arrays.stream(mod.provides)
- .map(p -> ProvidesDescription.create(cf, p))
- .collect(Collectors.toList());
- break;
- }
- case Attribute.ModuleTarget: {
- assert feature instanceof ModuleHeaderDescription;
- ModuleHeaderDescription header =
- (ModuleHeaderDescription) feature;
- ModuleTarget_attribute mod = (ModuleTarget_attribute) attr;
- if (mod.target_platform_index != 0) {
- header.moduleTarget =
- cf.constant_pool
- .getUTF8Value(mod.target_platform_index);
- }
- break;
- }
- case Attribute.ModuleResolution: {
- assert feature instanceof ModuleHeaderDescription;
- ModuleHeaderDescription header =
- (ModuleHeaderDescription) feature;
- ModuleResolution_attribute mod =
- (ModuleResolution_attribute) attr;
- header.moduleResolution = mod.resolution_flags;
- break;
- }
- case Attribute.ModulePackages:
- assert feature instanceof ModuleHeaderDescription;
- ModuleHeaderDescription header =
- (ModuleHeaderDescription) feature;
- ModulePackages_attribute mod =
- (ModulePackages_attribute) attr;
- header.extraModulePackages = new ArrayList<>();
- for (int i = 0; i < mod.packages_count; i++) {
- String packageName = getPackageName(cf, mod.packages_index[i]);
+ header.requires = a.requires().stream().map(RequiresDescription::create).collect(Collectors.toList());
+ header.uses = a.uses().stream().map(ClassEntry::asInternalName).collect(Collectors.toList());
+ header.provides = a.provides().stream().map(ProvidesDescription::create).collect(Collectors.toList());
+ }
+ case ModuleTargetAttribute a -> ((ModuleHeaderDescription) feature).moduleTarget = a.targetPlatform().stringValue();
+ case ModuleResolutionAttribute a -> ((ModuleHeaderDescription) feature).moduleResolution = a.resolutionFlags();
+ case ModulePackagesAttribute a -> {
+ var header = (ModuleHeaderDescription) feature;
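+ // Record only packages that are not already covered by an exports directive.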
+ header.extraModulePackages = a.packages().stream().<String>mapMulti((packageItem, sink) -> {
+ var packageName = packageItem.name().stringValue();
if (header.exports == null ||
- header.exports.stream().noneMatch(ed -> ed.packageName().equals(packageName))) {
- header.extraModulePackages.add(packageName);
+ header.exports.stream().noneMatch(ed -> ed.packageName().equals(packageName))) {
+ sink.accept(packageName);
}
- }
- break;
- case Attribute.ModuleHashes:
- break;
- case Attribute.NestHost: {
- assert feature instanceof ClassHeaderDescription;
- NestHost_attribute nestHost = (NestHost_attribute) attr;
- ClassHeaderDescription chd = (ClassHeaderDescription) feature;
- chd.nestHost = nestHost.getNestTop(cf.constant_pool).getName();
- break;
- }
- case Attribute.NestMembers: {
- assert feature instanceof ClassHeaderDescription;
- NestMembers_attribute nestMembers = (NestMembers_attribute) attr;
- ClassHeaderDescription chd = (ClassHeaderDescription) feature;
- chd.nestMembers = Arrays.stream(nestMembers.members_indexes)
- .mapToObj(i -> getClassName(cf, i))
- .collect(Collectors.toList());
- break;
+ }).collect(Collectors.toList());
}
- case Attribute.Record: {
- assert feature instanceof ClassHeaderDescription;
- Record_attribute record = (Record_attribute) attr;
-                List<RecordComponentDescription> components = new ArrayList<>();
- for (ComponentInfo info : record.component_info_arr) {
- RecordComponentDescription rcd = new RecordComponentDescription();
- rcd.name = info.getName(cf.constant_pool);
- rcd.descriptor = info.descriptor.getValue(cf.constant_pool);
- for (Attribute nestedAttr : info.attributes) {
- readAttribute(cf, rcd, nestedAttr);
- }
- components.add(rcd);
- }
- ClassHeaderDescription chd = (ClassHeaderDescription) feature;
+ case ModuleHashesAttribute _ -> {}
+ case NestHostAttribute a -> ((ClassHeaderDescription) feature).nestHost = a.nestHost().asInternalName();
+ case NestMembersAttribute a -> ((ClassHeaderDescription) feature).nestMembers = a.nestMembers().stream().map(ClassEntry::asInternalName).collect(Collectors.toList());
+ case RecordAttribute a -> {
+ var chd = (ClassHeaderDescription) feature;
chd.isRecord = true;
- chd.recordComponents = components;
- break;
- }
- case Attribute.MethodParameters: {
- assert feature instanceof MethodDescription;
- MethodParameters_attribute params = (MethodParameters_attribute) attr;
- MethodDescription method = (MethodDescription) feature;
- method.methodParameters = new ArrayList<>();
- for (MethodParameters_attribute.Entry e : params.method_parameter_table) {
- String name = e.name_index == 0 ? null
- : cf.constant_pool.getUTF8Value(e.name_index);
- MethodDescription.MethodParam param =
- new MethodDescription.MethodParam(e.flags, name);
- method.methodParameters.add(param);
- }
- break;
- }
- case Attribute.PermittedSubclasses: {
- assert feature instanceof ClassHeaderDescription;
- PermittedSubclasses_attribute permittedSubclasses = (PermittedSubclasses_attribute) attr;
- ClassHeaderDescription chd = (ClassHeaderDescription) feature;
- chd.permittedSubclasses = Arrays.stream(permittedSubclasses.subtypes)
- .mapToObj(i -> getClassName(cf, i))
- .collect(Collectors.toList());
+ chd.recordComponents = a.components().stream().map(rci -> {
+ var rcd = new RecordComponentDescription();
+ rcd.name = rci.name().stringValue();
+ rcd.descriptor = rci.descriptor().stringValue();
+ rci.attributes().forEach(child -> readAttribute(rcd, child));
+ return rcd;
+ }).collect(Collectors.toList());
+ }
+ case MethodParametersAttribute a -> ((MethodDescription) feature).methodParameters = a.parameters().stream()
+ .map(mpi -> new MethodDescription.MethodParam(mpi.flagsMask(), mpi.name().map(Utf8Entry::stringValue).orElse(null)))
+ .collect(Collectors.toList());
+ case PermittedSubclassesAttribute a -> {
+ var chd = (ClassHeaderDescription) feature;
chd.isSealed = true;
- break;
+ chd.permittedSubclasses = a.permittedSubclasses().stream().map(ClassEntry::asInternalName).collect(Collectors.toList());
}
- case Attribute.ModuleMainClass: {
- ModuleMainClass_attribute moduleMainClass = (ModuleMainClass_attribute) attr;
- assert feature instanceof ModuleHeaderDescription;
- ModuleHeaderDescription mhd = (ModuleHeaderDescription) feature;
- mhd.moduleMainClass = moduleMainClass.getMainClassName(cf.constant_pool);
- break;
- }
- default:
- throw new IllegalStateException("Unhandled attribute: " +
- attrName);
+ case ModuleMainClassAttribute a -> ((ModuleHeaderDescription) feature).moduleMainClass = a.mainClass().asInternalName();
+            default -> throw new IllegalArgumentException("Unhandled attribute: " + attr.attributeName());
}
return true;
}
- private static String getClassName(ClassFile cf, int idx) {
- try {
- return cf.constant_pool.getClassInfo(idx).getName();
- } catch (InvalidIndex ex) {
- throw new IllegalStateException(ex);
- } catch (ConstantPool.UnexpectedEntry ex) {
- throw new IllegalStateException(ex);
- } catch (ConstantPoolException ex) {
- throw new IllegalStateException(ex);
- }
- }
-
- private static String getPackageName(ClassFile cf, int idx) {
- try {
- return cf.constant_pool.getPackageInfo(idx).getName();
- } catch (InvalidIndex ex) {
- throw new IllegalStateException(ex);
- } catch (ConstantPool.UnexpectedEntry ex) {
- throw new IllegalStateException(ex);
- } catch (ConstantPoolException ex) {
- throw new IllegalStateException(ex);
- }
- }
-
- private static String getModuleName(ClassFile cf, int idx) {
- try {
- return cf.constant_pool.getModuleInfo(idx).getName();
- } catch (InvalidIndex ex) {
- throw new IllegalStateException(ex);
- } catch (ConstantPool.UnexpectedEntry ex) {
- throw new IllegalStateException(ex);
- } catch (ConstantPoolException ex) {
- throw new IllegalStateException(ex);
- }
- }
-
public static String INJECTED_VERSION = null;
- private static String getVersion(ClassFile cf, int idx) {
+    private static String getVersion(Optional<Utf8Entry> version) {
if (INJECTED_VERSION != null) {
return INJECTED_VERSION;
}
- if (idx == 0)
- return null;
- try {
- return ((CONSTANT_Utf8_info) cf.constant_pool.get(idx)).value;
- } catch (InvalidIndex ex) {
- throw new IllegalStateException(ex);
- }
+ return version.map(Utf8Entry::stringValue).orElse(null);
}
- Object convertConstantValue(CPInfo info, String descriptor) throws ConstantPoolException {
- if (info instanceof CONSTANT_Integer_info) {
- if ("Z".equals(descriptor))
- return ((CONSTANT_Integer_info) info).value == 1;
- else
- return ((CONSTANT_Integer_info) info).value;
- } else if (info instanceof CONSTANT_Long_info) {
- return ((CONSTANT_Long_info) info).value;
- } else if (info instanceof CONSTANT_Float_info) {
- return ((CONSTANT_Float_info) info).value;
- } else if (info instanceof CONSTANT_Double_info) {
- return ((CONSTANT_Double_info) info).value;
- } else if (info instanceof CONSTANT_String_info) {
- return ((CONSTANT_String_info) info).getString();
- }
- throw new IllegalStateException(info.getClass().getName());
+ Object convertConstantValue(ConstantValueEntry info, String descriptor) {
+ if (descriptor.length() == 1 && info instanceof IntegerEntry ie) {
+ var i = ie.intValue();
+ return switch (descriptor.charAt(0)) {
+ case 'I', 'B', 'S' -> i;
+ case 'C' -> (char) i;
+ case 'Z' -> i == 1;
+ default -> throw new IllegalArgumentException(descriptor);
+ };
+ }
+ return info.constantValue();
}
- Object convertElementValue(ConstantPool cp, element_value val) throws InvalidIndex, ConstantPoolException {
- switch (val.tag) {
- case 'Z':
- return ((CONSTANT_Integer_info) cp.get(((Primitive_element_value) val).const_value_index)).value != 0;
- case 'B':
- return (byte) ((CONSTANT_Integer_info) cp.get(((Primitive_element_value) val).const_value_index)).value;
- case 'C':
- return (char) ((CONSTANT_Integer_info) cp.get(((Primitive_element_value) val).const_value_index)).value;
- case 'S':
- return (short) ((CONSTANT_Integer_info) cp.get(((Primitive_element_value) val).const_value_index)).value;
- case 'I':
- return ((CONSTANT_Integer_info) cp.get(((Primitive_element_value) val).const_value_index)).value;
- case 'J':
- return ((CONSTANT_Long_info) cp.get(((Primitive_element_value) val).const_value_index)).value;
- case 'F':
- return ((CONSTANT_Float_info) cp.get(((Primitive_element_value) val).const_value_index)).value;
- case 'D':
- return ((CONSTANT_Double_info) cp.get(((Primitive_element_value) val).const_value_index)).value;
- case 's':
- return ((CONSTANT_Utf8_info) cp.get(((Primitive_element_value) val).const_value_index)).value;
-
- case 'e':
- return new EnumConstant(cp.getUTF8Value(((Enum_element_value) val).type_name_index),
- cp.getUTF8Value(((Enum_element_value) val).const_name_index));
- case 'c':
- return new ClassConstant(cp.getUTF8Value(((Class_element_value) val).class_info_index));
-
- case '@':
- return annotation2Description(cp, ((Annotation_element_value) val).annotation_value);
-
- case '[':
-                List<Object> values = new ArrayList<>();
- for (element_value elem : ((Array_element_value) val).values) {
- values.add(convertElementValue(cp, elem));
- }
- return values;
- default:
- throw new IllegalStateException("Currently unhandled tag: " + val.tag);
- }
+ Object convertElementValue(AnnotationValue val) {
+ return switch (val) {
+ case AnnotationValue.OfConstant oc -> oc.resolvedValue();
+ case AnnotationValue.OfEnum oe -> new EnumConstant(oe.className().stringValue(), oe.constantName().stringValue());
+ case AnnotationValue.OfClass oc -> new ClassConstant(oc.className().stringValue());
+ case AnnotationValue.OfArray oa -> oa.values().stream().map(this::convertElementValue).collect(Collectors.toList());
+ case AnnotationValue.OfAnnotation oa -> annotation2Description(oa.annotation());
+ };
}
-    private List<AnnotationDescription> annotations2Description(ConstantPool cp, Attribute attr) throws ConstantPoolException {
- RuntimeAnnotations_attribute annotationsAttr = (RuntimeAnnotations_attribute) attr;
-        List<AnnotationDescription> descs = new ArrayList<>();
- for (Annotation a : annotationsAttr.annotations) {
- descs.add(annotation2Description(cp, a));
- }
- return descs;
+    private List<AnnotationDescription> annotations2Description(List<java.lang.classfile.Annotation> annos) {
+ return annos.stream().map(this::annotation2Description).collect(Collectors.toList());
}
-    private List<List<AnnotationDescription>> parameterAnnotations2Description(ConstantPool cp, Attribute attr) throws ConstantPoolException {
- RuntimeParameterAnnotations_attribute annotationsAttr =
- (RuntimeParameterAnnotations_attribute) attr;
-        List<List<AnnotationDescription>> descs = new ArrayList<>();
- for (Annotation[] attrAnnos : annotationsAttr.parameter_annotations) {
-            List<AnnotationDescription> paramDescs = new ArrayList<>();
- for (Annotation ann : attrAnnos) {
- paramDescs.add(annotation2Description(cp, ann));
- }
- descs.add(paramDescs);
- }
- return descs;
+    private List<List<AnnotationDescription>> parameterAnnotations2Description(List<List<java.lang.classfile.Annotation>> annos) {
+ return annos.stream().map(this::annotations2Description).collect(Collectors.toList());
}
- private AnnotationDescription annotation2Description(ConstantPool cp, Annotation a) throws ConstantPoolException {
- String annotationType = cp.getUTF8Value(a.type_index);
+ private AnnotationDescription annotation2Description(java.lang.classfile.Annotation a) {
+ String annotationType = a.className().stringValue();
        Map<String, Object> values = new HashMap<>();
- for (element_value_pair e : a.element_value_pairs) {
- values.put(cp.getUTF8Value(e.element_name_index), convertElementValue(cp, e.value));
+ for (var e : a.elements()) {
+ values.put(e.name().stringValue(), convertElementValue(e.value()));
}
return new AnnotationDescription(annotationType, values);
@@ -3181,7 +2543,7 @@ public void read(LineBasedReader reader, String baselineVersion,
case "-module":
break OUTER;
default:
- throw new IllegalStateException(reader.lineKey);
+ throw new IllegalArgumentException(reader.lineKey);
}
}
}
@@ -3396,15 +2758,11 @@ public static ExportsDescription deserialize(String data) {
return new ExportsDescription(packageName, to);
}
- public static ExportsDescription create(ClassFile cf,
- ExportsEntry ee) {
- String packageName = getPackageName(cf, ee.exports_index);
+ public static ExportsDescription create(ModuleExportInfo ee) {
+ String packageName = ee.exportedPackage().name().stringValue();
        List<String> to = null;
- if (ee.exports_to_count > 0) {
- to = new ArrayList<>();
- for (int moduleIndex : ee.exports_to_index) {
- to.add(getModuleName(cf, moduleIndex));
- }
+ if (!ee.exportsTo().isEmpty()) {
+ to = ee.exportsTo().stream().map(m -> m.name().stringValue()).collect(Collectors.toList());
}
return new ExportsDescription(packageName, to);
}
@@ -3447,12 +2805,11 @@ public static RequiresDescription deserialize(String data) {
ver);
}
- public static RequiresDescription create(ClassFile cf,
- RequiresEntry req) {
- String mod = getModuleName(cf, req.requires_index);
- String ver = getVersion(cf, req.requires_version_index);
+ public static RequiresDescription create(ModuleRequireInfo req) {
+ String mod = req.requires().name().stringValue();
+ String ver = getVersion(req.requiresVersion());
return new RequiresDescription(mod,
- req.requires_flags,
+ req.requiresFlagsMask(),
ver);
}
@@ -3515,13 +2872,9 @@ public static ProvidesDescription deserialize(String data) {
implsList);
}
- public static ProvidesDescription create(ClassFile cf,
- ProvidesEntry prov) {
- String api = getClassName(cf, prov.provides_index);
-        List<String> impls =
- Arrays.stream(prov.with_index)
- .mapToObj(wi -> getClassName(cf, wi))
- .collect(Collectors.toList());
+ public static ProvidesDescription create(ModuleProvideInfo prov) {
+ String api = prov.provides().asInternalName();
+        List<String> impls = prov.providesWith().stream().map(ClassEntry::asInternalName).collect(Collectors.toList());
return new ProvidesDescription(api, impls);
}
@@ -3676,7 +3029,7 @@ public void read(LineBasedReader reader, String baselineVersion,
case "-module":
break OUTER;
default:
- throw new IllegalStateException(reader.lineKey);
+ throw new IllegalArgumentException(reader.lineKey);
}
}
}
@@ -4073,7 +3426,7 @@ public MethodParam(int flags, String name) {
static class FieldDescription extends FeatureDescription {
String name;
String descriptor;
- Object constantValue;
+ Object constantValue; // Uses (unsigned) Integer for byte/short
String keyName = "field";
@Override
@@ -4149,7 +3502,7 @@ public boolean read(LineBasedReader reader) throws IOException {
case "D": constantValue = Double.parseDouble(inConstantValue); break;
case "Ljava/lang/String;": constantValue = inConstantValue; break;
default:
- throw new IllegalStateException("Unrecognized field type: " + descriptor);
+ throw new IllegalArgumentException("Unrecognized field type: " + descriptor);
}
}
@@ -4416,7 +3769,7 @@ public ClassDescription find(String name, boolean allowNull) {
if (desc != null || allowNull)
return desc;
- throw new IllegalStateException("Cannot find: " + name);
+ throw new IllegalArgumentException("Cannot find: " + name);
}
private static final ClassDescription NONE = new ClassDescription();
@@ -4565,7 +3918,7 @@ private static Object parseAnnotationValue(String value, int[] valuePointer) {
valuePointer[0] += 5;
return false;
} else {
- throw new IllegalStateException("Unrecognized boolean structure: " + value);
+ throw new IllegalArgumentException("Unrecognized boolean structure: " + value);
}
case 'B': return Byte.parseByte(readDigits(value, valuePointer));
case 'C': return value.charAt(valuePointer[0]++);
@@ -4593,7 +3946,7 @@ private static Object parseAnnotationValue(String value, int[] valuePointer) {
case '@':
return parseAnnotation(value, valuePointer);
default:
- throw new IllegalStateException("Unrecognized signature type: " + value.charAt(valuePointer[0] - 1) + "; value=" + value);
+ throw new IllegalArgumentException("Unrecognized signature type: " + value.charAt(valuePointer[0] - 1) + "; value=" + value);
}
}
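
Aside (not part of the patch): the CreateSymbols rewrite above swaps the internal com.sun.tools.classfile reader for the standard java.lang.classfile API. Below is a minimal, self-contained sketch of that reading pattern; the class name ReadModuleInfo and the module-info.class path are illustrative placeholders, not code from this change.

    // Sketch only: reads a module-info.class with the java.lang.classfile API,
    // mirroring the pattern-matching switch used by CreateSymbols above.
    import java.lang.classfile.Attribute;
    import java.lang.classfile.ClassFile;
    import java.lang.classfile.ClassModel;
    import java.lang.classfile.attribute.ModuleAttribute;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class ReadModuleInfo {
        public static void main(String[] args) throws Exception {
            byte[] bytes = Files.readAllBytes(Path.of("module-info.class")); // placeholder input
            ClassModel model = ClassFile.of().parse(bytes);
            // Attributes arrive as typed objects; no manual constant-pool index chasing.
            for (Attribute<?> attr : model.attributes()) {
                switch (attr) {
                    case ModuleAttribute m -> {
                        System.out.println("module " + m.moduleName().name().stringValue());
                        m.exports().forEach(e -> System.out.println("  exports " + e.exportedPackage().name().stringValue()));
                        m.requires().forEach(r -> System.out.println("  requires " + r.requires().name().stringValue()));
                    }
                    default -> { } // other attributes ignored in this sketch
                }
            }
        }
    }
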
diff --git a/make/modules/java.base/Copy.gmk b/make/modules/java.base/Copy.gmk
index e33676529cdf3..4625c1f7dbca1 100644
--- a/make/modules/java.base/Copy.gmk
+++ b/make/modules/java.base/Copy.gmk
@@ -99,7 +99,7 @@ JVMCFG := $(LIB_DST_DIR)/jvm.cfg
define print-cfg-line
$(call LogInfo, Adding -$1 $2 to jvm.cfg)
- $(PRINTF) -- "-$1 $2\n" >> $@ $(NEWLINE)
+ $(ECHO) "-$1 $2" >> $@ $(NEWLINE)
endef
$(JVMCFG): $(call DependOnVariable, ORDERED_CFG_VARIANTS)
diff --git a/make/modules/java.base/Java.gmk b/make/modules/java.base/Java.gmk
index 84344f93409a8..fc09137745662 100644
--- a/make/modules/java.base/Java.gmk
+++ b/make/modules/java.base/Java.gmk
@@ -41,11 +41,6 @@ CLEAN += intrinsic.properties
EXCLUDE_FILES += \
$(TOPDIR)/src/java.base/share/classes/jdk/internal/module/ModuleLoaderMap.java
-EXCLUDES += java/lang/doc-files \
- java/lang/classfile/snippet-files \
- java/lang/classfile/components/snippet-files \
- java/lang/foreign/snippet-files
-
# Exclude BreakIterator classes that are just used in compile process to generate
# data files and shouldn't go in the product
EXCLUDE_FILES += sun/text/resources/BreakIteratorRules.java
diff --git a/make/modules/java.base/Lib.gmk b/make/modules/java.base/Lib.gmk
index 84ee309dadd11..51d323a0344f2 100644
--- a/make/modules/java.base/Lib.gmk
+++ b/make/modules/java.base/Lib.gmk
@@ -158,6 +158,7 @@ endif
$(eval $(call SetupJdkLibrary, BUILD_LIBSYSLOOKUP, \
NAME := syslookup, \
+ EXTRA_HEADER_DIRS := java.base:libjava, \
LD_SET_ORIGIN := false, \
LDFLAGS_linux := -Wl$(COMMA)--no-as-needed, \
LDFLAGS_aix := -brtl -bexpfull, \
diff --git a/make/modules/java.base/gensrc/GensrcBuffer.gmk b/make/modules/java.base/gensrc/GensrcBuffer.gmk
index f769a8e61e052..dd91c8c870a13 100644
--- a/make/modules/java.base/gensrc/GensrcBuffer.gmk
+++ b/make/modules/java.base/gensrc/GensrcBuffer.gmk
@@ -272,7 +272,7 @@ define SetupGenBuffer
$$($1_long_CMD) -i$$($1_SRC_BIN) -o$$($1_DST).tmp
$$($1_float_CMD) -i$$($1_SRC_BIN) -o$$($1_DST).tmp
$$($1_double_CMD) -i$$($1_SRC_BIN) -o$$($1_DST).tmp
- $(PRINTF) "}\n" >> $$($1_DST).tmp
+ $(ECHO) "}" >> $$($1_DST).tmp
mv $$($1_DST).tmp $$($1_DST)
endif
diff --git a/make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk b/make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk
index 00fba64394b7b..ea51e4fd4ee12 100644
--- a/make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk
+++ b/make/modules/java.base/gensrc/GensrcScopedMemoryAccess.gmk
@@ -163,7 +163,7 @@ $(SCOPED_MEMORY_ACCESS_DEST): $(BUILD_TOOLS_JDK) $(SCOPED_MEMORY_ACCESS_TEMPLATE
$(foreach t, $(SCOPE_MEMORY_ACCESS_TYPES), \
$(TOOL_SPP) -nel -K$(BIN_$t_type) -Dtype=$(BIN_$t_type) -DType=$(BIN_$t_Type) $(BIN_$t_ARGS) \
-i$(SCOPED_MEMORY_ACCESS_BIN_TEMPLATE) -o$(SCOPED_MEMORY_ACCESS_DEST) ;)
- $(PRINTF) "}\n" >> $(SCOPED_MEMORY_ACCESS_DEST)
+ $(ECHO) "}" >> $(SCOPED_MEMORY_ACCESS_DEST)
TARGETS += $(SCOPED_MEMORY_ACCESS_DEST)
diff --git a/make/modules/java.compiler/Java.gmk b/make/modules/java.compiler/Java.gmk
index cb720672639f5..d0a1fbf4cd510 100644
--- a/make/modules/java.compiler/Java.gmk
+++ b/make/modules/java.compiler/Java.gmk
@@ -32,6 +32,4 @@
DOCLINT += -Xdoclint:all/protected \
'-Xdoclint/package:java.*,javax.*'
-EXCLUDES += javax/tools/snippet-files
-
################################################################################
diff --git a/make/modules/java.desktop/Java.gmk b/make/modules/java.desktop/Java.gmk
index 61c7fa44e0e83..bab6186fb0d0b 100644
--- a/make/modules/java.desktop/Java.gmk
+++ b/make/modules/java.desktop/Java.gmk
@@ -31,15 +31,6 @@ DOCLINT += -Xdoclint:all/protected \
COPY += .gif .png .wav .txt .xml .css .pf
CLEAN += iio-plugin.properties cursors.properties
-EXCLUDES += \
- java/awt/doc-files \
- javax/swing/doc-files \
- javax/swing/text/doc-files \
- javax/swing/plaf/synth/doc-files \
- javax/swing/undo/doc-files \
- sun/awt/X11/doc-files \
- #
-
EXCLUDE_FILES += \
javax/swing/plaf/nimbus/InternalFrameTitlePanePainter.java \
javax/swing/plaf/nimbus/OptionPaneMessageAreaPainter.java \
diff --git a/make/modules/java.desktop/lib/ClientLibraries.gmk b/make/modules/java.desktop/lib/ClientLibraries.gmk
index a9511fba58ba0..dcb41defba354 100644
--- a/make/modules/java.desktop/lib/ClientLibraries.gmk
+++ b/make/modules/java.desktop/lib/ClientLibraries.gmk
@@ -87,7 +87,7 @@ $(eval $(call SetupJdkLibrary, BUILD_LIBLCMS, \
libawt/java2d \
java.base:libjvm, \
HEADERS_FROM_SRC := $(LIBLCMS_HEADERS_FROM_SRC), \
- DISABLED_WARNINGS_gcc := format-nonliteral stringop-truncation type-limits \
+ DISABLED_WARNINGS_gcc := format-nonliteral stringop-truncation \
unused-variable, \
DISABLED_WARNINGS_clang := format-nonliteral, \
JDK_LIBS := libawt java.base:libjava, \
diff --git a/make/modules/jdk.compiler/Gendata.gmk b/make/modules/jdk.compiler/Gendata.gmk
index 739625a5732a1..2afc6e98e3791 100644
--- a/make/modules/jdk.compiler/Gendata.gmk
+++ b/make/modules/jdk.compiler/Gendata.gmk
@@ -53,7 +53,6 @@ COMPILECREATESYMBOLS_ADD_EXPORTS := \
--add-exports jdk.compiler/com.sun.tools.javac.code=ALL-UNNAMED \
--add-exports jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED \
--add-exports jdk.compiler/com.sun.tools.javac.jvm=ALL-UNNAMED \
- --add-exports jdk.jdeps/com.sun.tools.classfile=ALL-UNNAMED \
#
# TODO: Unify with jdk.javadoc-gendata. Should only compile this once and share.
@@ -61,7 +60,7 @@ $(eval $(call SetupJavaCompilation, COMPILE_CREATE_SYMBOLS, \
TARGET_RELEASE := $(TARGET_RELEASE_NEWJDK), \
COMPILER := buildjdk, \
SRC := $(TOPDIR)/make/langtools/src/classes, \
- INCLUDES := build/tools/symbolgenerator com/sun/tools/classfile, \
+ INCLUDES := build/tools/symbolgenerator, \
BIN := $(BUILDTOOLS_OUTPUTDIR)/create_symbols_javac, \
DISABLED_WARNINGS := options, \
JAVAC_FLAGS := \
diff --git a/make/modules/jdk.javadoc/Gendata.gmk b/make/modules/jdk.javadoc/Gendata.gmk
index 2cd812de779ef..a97342ffd049e 100644
--- a/make/modules/jdk.javadoc/Gendata.gmk
+++ b/make/modules/jdk.javadoc/Gendata.gmk
@@ -51,7 +51,7 @@ $(eval $(call SetupJavaCompilation, COMPILE_CREATE_SYMBOLS, \
TARGET_RELEASE := $(TARGET_RELEASE_BOOTJDK), \
SRC := $(TOPDIR)/make/langtools/src/classes \
$(TOPDIR)/src/jdk.jdeps/share/classes, \
- INCLUDES := build/tools/symbolgenerator com/sun/tools/classfile, \
+ INCLUDES := build/tools/symbolgenerator, \
BIN := $(BUILDTOOLS_OUTPUTDIR)/create_symbols_javadoc, \
DISABLED_WARNINGS := options, \
JAVAC_FLAGS := \
diff --git a/make/modules/jdk.jdi/Java.gmk b/make/modules/jdk.jdi/Java.gmk
index d31008c318fdc..c5e3e715c1a68 100644
--- a/make/modules/jdk.jdi/Java.gmk
+++ b/make/modules/jdk.jdi/Java.gmk
@@ -31,7 +31,6 @@ EXCLUDES += \
com/sun/tools/example/debug/bdi \
com/sun/tools/example/debug/event \
com/sun/tools/example/debug/gui \
- com/sun/jdi/doc-files \
#
EXCLUDE_FILES += jdi-overview.html
diff --git a/make/modules/jdk.jlink/Java.gmk b/make/modules/jdk.jlink/Java.gmk
new file mode 100644
index 0000000000000..4ddd1eab03d92
--- /dev/null
+++ b/make/modules/jdk.jlink/Java.gmk
@@ -0,0 +1,32 @@
+#
+# Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
+# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+#
+# This code is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License version 2 only, as
+# published by the Free Software Foundation. Oracle designates this
+# particular file as subject to the "Classpath" exception as provided
+# by Oracle in the LICENSE file that accompanied this code.
+#
+# This code is distributed in the hope that it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+# version 2 for more details (a copy is included in the LICENSE file that
+# accompanied this code).
+#
+# You should have received a copy of the GNU General Public License version
+# 2 along with this work; if not, write to the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+#
+# Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
+# or visit www.oracle.com if you need additional information or have any
+# questions.
+#
+
+################################################################################
+
+# Instruct SetupJavaCompilation for the jdk.jlink module to include
+# upgrade_files_<module>.conf files
+COPY += .conf
+
+################################################################################
diff --git a/make/scripts/compare.sh b/make/scripts/compare.sh
index cc2f4adf2ed8b..250b5a37b9f5f 100644
--- a/make/scripts/compare.sh
+++ b/make/scripts/compare.sh
@@ -203,12 +203,12 @@ compare_permissions() {
do
if [ ! -f ${OTHER_DIR}/$f ]; then continue; fi
if [ ! -f ${THIS_DIR}/$f ]; then continue; fi
- OP=`ls -l ${OTHER_DIR}/$f | awk '{printf("%.10s\n", $1);}'`
- TP=`ls -l ${THIS_DIR}/$f | awk '{printf("%.10s\n", $1);}'`
+ OP=`ls -l ${OTHER_DIR}/$f | $AWK '{printf("%.10s\n", $1);}'`
+ TP=`ls -l ${THIS_DIR}/$f | $AWK '{printf("%.10s\n", $1);}'`
if [ "$OP" != "$TP" ]
then
if [ -z "$found" ]; then echo ; found="yes"; fi
- $PRINTF "\tother: ${OP} this: ${TP}\t$f\n"
+ $PRINTF "\tother: %s this: %s\t%s\n" "${OP}" "${TP}" "$f"
fi
done
if [ -z "$found" ]; then
@@ -260,7 +260,7 @@ compare_file_types() {
continue
else
if [ -z "$found" ]; then echo ; found="yes"; fi
- $PRINTF "\tother: ${OF}\n\tthis : ${TF}\n"
+ $PRINTF "\tother: %s\n\tthis : %s\n" "${OF}" "${TF}"
fi
fi
done
diff --git a/make/scripts/fixpath.sh b/make/scripts/fixpath.sh
index 3a886fee07c47..8eaf57d18f380 100644
--- a/make/scripts/fixpath.sh
+++ b/make/scripts/fixpath.sh
@@ -1,6 +1,6 @@
#!/bin/bash
#
-# Copyright (c) 2020, 2023, Oracle and/or its affiliates. All rights reserved.
+# Copyright (c) 2020, 2025, Oracle and/or its affiliates. All rights reserved.
# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
#
# This code is free software; you can redistribute it and/or modify it
@@ -157,11 +157,21 @@ function import_path() {
if [[ $? -eq 0 && -e "$unixpath" ]]; then
if [[ ! "$winpath" =~ ^"$ENVROOT"\\.*$ ]] ; then
# If it is not in envroot, it's a generic windows path
- if [[ ! $winpath =~ ^[-_.:\\a-zA-Z0-9]*$ ]] ; then
+ if [[ ! $winpath =~ ^[-_.:~\\a-zA-Z0-9]*$ ]] ; then
# Path has forbidden characters, rewrite as short name
# This monster of a command uses the %~s support from cmd.exe to
# reliably convert to short paths on all winenvs.
shortpath="$($CMD /q /c for %I in \( "$winpath" \) do echo %~sI 2>/dev/null | tr -d \\n\\r)"
+ if [[ ! $shortpath =~ ^[-_.:~\\a-zA-Z0-9]*$ ]] ; then
+ if [[ $QUIET != true ]]; then
+ echo fixpath: failure: Path "'"$path"'" could not be converted to short path >&2
+ fi
+ if [[ $IGNOREFAILURES != true ]]; then
+ exit 1
+ else
+ shortpath=""
+ fi
+ fi
unixpath="$($PATHTOOL -u "$shortpath")"
# unixpath is based on short name
fi
diff --git a/make/scripts/generate-symbol-data.sh b/make/scripts/generate-symbol-data.sh
index 015beb582fdaf..6f38d8730093c 100644
--- a/make/scripts/generate-symbol-data.sh
+++ b/make/scripts/generate-symbol-data.sh
@@ -68,8 +68,7 @@ if [ "`git status --porcelain=v1 .`x" != "x" ] ; then
exit 1
fi;
-$1/bin/java --add-exports jdk.jdeps/com.sun.tools.classfile=ALL-UNNAMED \
- --add-exports jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED \
+$1/bin/java --add-exports jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED \
--add-exports jdk.compiler/com.sun.tools.javac.jvm=ALL-UNNAMED \
--add-exports jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED \
--add-modules jdk.jdeps \
diff --git a/make/scripts/update_copyright_year.sh b/make/scripts/update_copyright_year.sh
index bb61d48c91cc9..578ab4cbc9923 100644
--- a/make/scripts/update_copyright_year.sh
+++ b/make/scripts/update_copyright_year.sh
@@ -1,7 +1,7 @@
#!/bin/bash -f
#
-# Copyright (c) 2010, 2024, Oracle and/or its affiliates. All rights reserved.
+# Copyright (c) 2010, 2025, Oracle and/or its affiliates. All rights reserved.
# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
#
# This code is free software; you can redistribute it and/or modify it
@@ -41,10 +41,6 @@ set -e
# To allow total changes counting
shopt -s lastpipe
-# Get an absolute path to this script, since that determines the top-level directory.
-this_script_dir=`dirname $0`
-this_script_dir=`cd $this_script_dir > /dev/null && pwd`
-
# Temp area
tmp=/tmp/`basename $0`.${USER}.$$
rm -f -r ${tmp}
@@ -98,10 +94,16 @@ while getopts "c:fhy:" option; do
done
# VCS check
+git_installed=false
+which git > /dev/null && git_installed=true
+if [ "$git_installed" != "true" ]; then
+ echo "Error: This script requires git. Please install it."
+ exit 1
+fi
git_found=false
-[ -d "${this_script_dir}/../../.git" ] && git_found=true
+git status &> /dev/null && git_found=true
if [ "$git_found" != "true" ]; then
- echo "Error: Please execute script from within make/scripts."
+ echo "Error: Please execute script from within a JDK git repository."
exit 1
else
echo "Using Git version control system"
diff --git a/make/test/BuildMicrobenchmark.gmk b/make/test/BuildMicrobenchmark.gmk
index 92f40472c3cb9..347ca44d25f39 100644
--- a/make/test/BuildMicrobenchmark.gmk
+++ b/make/test/BuildMicrobenchmark.gmk
@@ -119,7 +119,6 @@ $(JMH_UNPACKED_JARS_DONE): $(JMH_RUNTIME_JARS)
$(foreach jar, $(JMH_RUNTIME_JARS), \
$$($(UNZIP) -oq $(jar) -d $(JMH_UNPACKED_DIR)))
$(RM) -r $(JMH_UNPACKED_DIR)/META-INF
- $(RM) $(JMH_UNPACKED_DIR)/*.xml
$(TOUCH) $@
# Copy dependency files for inclusion in the benchmark JARs
diff --git a/src/demo/share/java2d/J2DBench/resources/textdata/arabic.ut8.txt b/src/demo/share/java2d/J2DBench/resources/textdata/arabic.ut8.txt
index 59447654660b7..ebb6c2eb3982a 100644
--- a/src/demo/share/java2d/J2DBench/resources/textdata/arabic.ut8.txt
+++ b/src/demo/share/java2d/J2DBench/resources/textdata/arabic.ut8.txt
@@ -1,4 +1,4 @@
-ما هي الشفرة الموحدة "يونِكود" ؟
+ما هي الشفرة الموحدة "يونِكود" ؟
أساسًا، تتعامل الحواسيب فقط مع الأرقام، وتقوم بتخزين الأحرف والمحارف الأخرى بعد أن تُعطي رقما معينا لكل واحد منها. وقبل اختراع "يونِكود"، كان هناك مئات الأنظمة للتشفير وتخصيص هذه الأرقام للمحارف، ولم يوجد نظام تشفير واحد يحتوي على جميع المحارف الضرورية. وعلى سبيل المثال، فإن الاتحاد الأوروبي لوحده، احتوى العديد من الشفرات المختلفة ليغطي جميع اللغات المستخدمة في الاتحاد. وحتى لو اعتبرنا لغة واحدة، كاللغة الإنجليزية، فإن جدول شفرة واحد لم يكف لاستيعاب جميع الأحرف وعلامات الترقيم والرموز الفنية والعلمية الشائعة الاستعمال.
@@ -8,4 +8,4 @@
تخصص الشفرة الموحدة "يونِكود" رقما وحيدا لكل محرف في جميع اللغات العالمية، وذلك بغض النظر عن نوع الحاسوب أو البرامج المستخدمة. وقد تـم تبني مواصفة "يونِكود" مــن قبـل قادة الصانعين لأنظمة الحواسيب فـي العالم، مثل شركات آي.بي.إم. (IBM)، أبـل (APPLE)، هِيـْولِـت بـاكـرد (Hewlett-Packard) ، مايكروسوفت (Microsoft)، أوراكِـل (Oracle) ، صن (Sun) وغيرها. كما أن المواصفات والمقاييس الحديثة (مثل لغة البرمجة "جافا" "JAVA" ولغة "إكس إم إل" "XML" التي تستخدم لبرمجة الانترنيت) تتطلب استخدام "يونِكود". علاوة على ذلك ، فإن "يونِكود" هي الطـريـقـة الرسـمية لتطبيق المقيـاس الـعـالـمي إيزو ١٠٦٤٦ (ISO 10646) .
-إن بزوغ مواصفة "يونِكود" وتوفُّر الأنظمة التي تستخدمه وتدعمه، يعتبر من أهم الاختراعات الحديثة في عولمة البرمجيات لجميع اللغات في العالم. وإن استخدام "يونِكود" في عالم الانترنيت سيؤدي إلى توفير كبير مقارنة مع استخدام المجموعات التقليدية للمحارف المشفرة. كما أن استخدام "يونِكود" سيُمكِّن المبرمج من كتابة البرنامج مرة واحدة، واستخدامه على أي نوع من الأجهزة أو الأنظمة، ولأي لغة أو دولة في العالم أينما كانت، دون الحاجة لإعادة البرمجة أو إجراء أي تعديل. وأخيرا، فإن استخدام "يونِكود" سيمكن البيانات من الانتقال عبر الأنظمة والأجهزة المختلفة دون أي خطورة لتحريفها، مهما تعددت الشركات الصانعة للأنظمة واللغات، والدول التي تمر من خلالها هذه البيانات.
\ No newline at end of file
+إن بزوغ مواصفة "يونِكود" وتوفُّر الأنظمة التي تستخدمه وتدعمه، يعتبر من أهم الاختراعات الحديثة في عولمة البرمجيات لجميع اللغات في العالم. وإن استخدام "يونِكود" في عالم الانترنيت سيؤدي إلى توفير كبير مقارنة مع استخدام المجموعات التقليدية للمحارف المشفرة. كما أن استخدام "يونِكود" سيُمكِّن المبرمج من كتابة البرنامج مرة واحدة، واستخدامه على أي نوع من الأجهزة أو الأنظمة، ولأي لغة أو دولة في العالم أينما كانت، دون الحاجة لإعادة البرمجة أو إجراء أي تعديل. وأخيرا، فإن استخدام "يونِكود" سيمكن البيانات من الانتقال عبر الأنظمة والأجهزة المختلفة دون أي خطورة لتحريفها، مهما تعددت الشركات الصانعة للأنظمة واللغات، والدول التي تمر من خلالها هذه البيانات.
diff --git a/src/demo/share/java2d/J2DBench/resources/textdata/english.ut8.txt b/src/demo/share/java2d/J2DBench/resources/textdata/english.ut8.txt
index dff289d24171f..8689e9e2d9e53 100644
--- a/src/demo/share/java2d/J2DBench/resources/textdata/english.ut8.txt
+++ b/src/demo/share/java2d/J2DBench/resources/textdata/english.ut8.txt
@@ -1,4 +1,4 @@
-What is Unicode?
+What is Unicode?
Unicode provides a unique number for every character,
no matter what the platform,
no matter what the program,
@@ -16,4 +16,4 @@ Incorporating Unicode into client-server or multi-tiered applications and websit
About the Unicode Consortium
The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard, which specifies the representation of text in modern software products and standards. The membership of the consortium represents a broad spectrum of corporations and organizations in the computer and information processing industry. The consortium is supported financially solely through membership dues. Membership in the Unicode Consortium is open to organizations and individuals anywhere in the world who support the Unicode Standard and wish to assist in its extension and implementation.
-For more information, see the Glossary, Unicode Enabled Products, Technical Introduction and Useful Resources.
\ No newline at end of file
+For more information, see the Glossary, Unicode Enabled Products, Technical Introduction and Useful Resources.
diff --git a/src/demo/share/java2d/J2DBench/resources/textdata/greek.ut8.txt b/src/demo/share/java2d/J2DBench/resources/textdata/greek.ut8.txt
index 360301c165c03..ce7b75c404817 100644
--- a/src/demo/share/java2d/J2DBench/resources/textdata/greek.ut8.txt
+++ b/src/demo/share/java2d/J2DBench/resources/textdata/greek.ut8.txt
@@ -1,4 +1,4 @@
-Τι είναι το Unicode?
+Τι είναι το Unicode?
Η κωδικοσελίδα Unicode προτείνει έναν και μοναδικό αριθμό για κάθε χαρακτήρα,
ανεξάρτητα από το λειτουργικό σύστημα,
diff --git a/src/demo/share/java2d/J2DBench/resources/textdata/hebrew.ut8.txt b/src/demo/share/java2d/J2DBench/resources/textdata/hebrew.ut8.txt
index 274c77e7b355a..06a91c9784663 100644
--- a/src/demo/share/java2d/J2DBench/resources/textdata/hebrew.ut8.txt
+++ b/src/demo/share/java2d/J2DBench/resources/textdata/hebrew.ut8.txt
@@ -1,10 +1,10 @@
-
-יוניקוד מקצה מספר ייחודי לכל תו,
-לא משנה על איזו פלטפורמה,
-לא משנה באיזו תוכנית,
+
+יוניקוד מקצה מספר ייחודי לכל תו,
+לא משנה על איזו פלטפורמה,
+לא משנה באיזו תוכנית,
ולא משנה באיזו שפה.
-באופן בסיסי, מחשבים עוסקים רק במספרים. הם מאחסנים אותיות ותווים אחרים על-ידי הקצאת מספר לכל אחד מהם. בטרם הומצא היוניקוד, היו מאות מערכות קידוד שונות להקצאת המספרים הללו. אף לא אחת מהן יכלה להכיל כמות תווים מספקת. לדוגמא: רק לאיחוד האירופאי נדרשים כמה סוגי קידודים שונים על מנת לכסות את כל השפות המדוברות בו. יתירה מזאת אף לשפה בודדת, כמו אנגלית למשל, לא היה די במערכת קידוד אחת בעבור כל האותיות, סימני הפיסוק והסמלים הטכניים שבשימוש שוטף.
+באופן בסיסי, מחשבים עוסקים רק במספרים. הם מאחסנים אותיות ותווים אחרים על-ידי הקצאת מספר לכל אחד מהם. בטרם הומצא היוניקוד, היו מאות מערכות קידוד שונות להקצאת המספרים הללו. אף לא אחת מהן יכלה להכיל כמות תווים מספקת. לדוגמא: רק לאיחוד האירופאי נדרשים כמה סוגי קידודים שונים על מנת לכסות את כל השפות המדוברות בו. יתירה מזאת אף לשפה בודדת, כמו אנגלית למשל, לא היה די במערכת קידוד אחת בעבור כל האותיות, סימני הפיסוק והסמלים הטכניים שבשימוש שוטף.
מערכות קידוד אלו אף סותרות זו את זו. כלומר, שני קידודים יכולים להשתמש באותו מספר לשני תוים נבדלים, או להשתמש במספרים שונים לאותו תו. על כל מחשב (ובמיוחד שרתים) לתמוך במספר רב של מערכות קידוד שונות; אולם כל אימת שנתונים עוברים בין מערכות קידוד או פלטפורמות שונות קיים הסיכון שייפגמו.
@@ -13,7 +13,7 @@
שילוב יוניקוד ביישומי שרת-לקוח או ביישומים רבי-שכבות ובאתרי אינטרנט מאפשר חיסכון ניכר בעלויות לעומת השימוש בסדרות התווים המסורתיות. הודות ליוניקוד, מוצר תוכנה אחד או אתר יחיד ברשת יכול להרחיב את יעדיו למגוון פלטפורמות, ארצות ושפות ללא צורך בשינויים מרחיקים. יוניקוד מאפשר מעבר נתונים דרך מערכות רבות ושונות מבלי שייפגמו.
-פרטים אודות הקונסורציום של יוניקוד (Unicode Consortium)
+פרטים אודות הקונסורציום של יוניקוד (Unicode Consortium)
הקונסורציום של יוניקוד הוא ארגון ללא מטרת רווח שנוסד כדי לפתח, להרחיב ולקדם את השימוש בתקן יוניקוד, אשר מגדיר את ייצוג הטקסט במוצרי תוכנה ותקנים מודרניים. חברים בקונסורציום מגוון רחב של תאגידים וארגונים בתעשיית המחשבים ועיבוד המידע. הקונסורציום ממומן על-ידי דמי-חבר בלבד. החברות בקונסורציום יוניקוד פתוחה לארגונים ולאנשים פרטיים, בכל רחבי העולם, אשר תומכים בתקן יוניקוד ומעוניינים לסייע בהתפתחותו והטמעתו.
-למידע נוסף, ראה מילון מונחים, רשימה חלקית של מוצרים מותאמים ליוניקוד, מבוא טכני ו- חומרי עזר [קישורים באנגלית].
\ No newline at end of file
+למידע נוסף, ראה מילון מונחים, רשימה חלקית של מוצרים מותאמים ליוניקוד, מבוא טכני ו- חומרי עזר [קישורים באנגלית].
diff --git a/src/demo/share/java2d/J2DBench/resources/textdata/hindi.ut8.txt b/src/demo/share/java2d/J2DBench/resources/textdata/hindi.ut8.txt
index d5714d31adc27..3fe9ae797e92c 100644
--- a/src/demo/share/java2d/J2DBench/resources/textdata/hindi.ut8.txt
+++ b/src/demo/share/java2d/J2DBench/resources/textdata/hindi.ut8.txt
@@ -1,4 +1,4 @@
-यूनिकोड क्या है?
+यूनिकोड क्या है?
यूनिकोड प्रत्येक अक्षर के लिए एक विशेष नम्बर प्रदान करता है,
चाहे कोई भी प्लैटफॉर्म हो,
चाहे कोई भी प्रोग्राम हो,
@@ -16,4 +16,4 @@
यूनिकोड कन्सॉर्शियम के बारे में
यूनिकोड कन्सॉर्शियम, लाभ न कमाने वाला एक संगठन है जिसकी स्थापना यूनिकोड स्टैंडर्ड, जो आधुनिक सॉफ्टवेयर उत्पादों और मानकों में पाठ की प्रस्तुति को निर्दिष्ट करता है, के विकास, विस्तार और इसके प्रयोग को बढ़ावा देने के लिए की गई थी। इस कन्सॉर्शियम के सदस्यों में, कम्प्यूटर और सूचना उद्योग में विभिन्न निगम और संगठन शामिल हैं। इस कन्सॉर्शियम का वित्तपोषण पूर्णतः सदस्यों के शुल्क से किया जाता है। यूनिकोड कन्सॉर्शियम में सदस्यता, विश्व में कहीं भी स्थित उन संगठनों और व्यक्तियों के लिए खुली है जो यूनिकोड का समर्थन करते हैं और जो इसके विस्तार और कार्यान्वयन में सहायता करना चाहते हैं।
-अधिक जानकारी के लिए, शब्दावली, सैम्पल यूनिकोड-सक्षम उत्पाद, तकनीकी परिचय और उपयोगी स्रोत देखिए।
\ No newline at end of file
+अधिक जानकारी के लिए, शब्दावली, सैम्पल यूनिकोड-सक्षम उत्पाद, तकनीकी परिचय और उपयोगी स्रोत देखिए।
diff --git a/src/demo/share/java2d/J2DBench/resources/textdata/japanese.ut8.txt b/src/demo/share/java2d/J2DBench/resources/textdata/japanese.ut8.txt
index 6d92e72ec4ca8..a4da2ce52f300 100644
--- a/src/demo/share/java2d/J2DBench/resources/textdata/japanese.ut8.txt
+++ b/src/demo/share/java2d/J2DBench/resources/textdata/japanese.ut8.txt
@@ -1,4 +1,4 @@
-ユニコードとは何か?
+ユニコードとは何か?
ユニコードは、すべての文字に固有の番号を付与します
プラットフォームには依存しません
プログラムにも依存しません
@@ -16,4 +16,4 @@
ユニコードコンソーシアムについて
ユニコードコンソーシアムは、最新のソフトウエア製品と標準においてテキストを表現することを意味する“ユニコード標準”の構築、発展、普及、利用促進を目的として設立された非営利組織です。同コンソーシアムの会員は、コンピューターと情報処理に係わる広汎な企業や組織から構成されています。同コンソーシアムは、財政的には、純粋に会費のみによって運営されています。ユニコード標準を支持し、その拡張と実装を支援する世界中の組織や個人は、だれもがユニコードコンソーシアムの会員なることができます。
-より詳しいことをお知りになりたい方は、Glossary, Unicode-Enabled Products, Technical Introduction および Useful Resourcesをご参照ください。
\ No newline at end of file
+より詳しいことをお知りになりたい方は、Glossary, Unicode-Enabled Products, Technical Introduction および Useful Resourcesをご参照ください。
diff --git a/src/demo/share/java2d/J2DBench/resources/textdata/korean.ut8.txt b/src/demo/share/java2d/J2DBench/resources/textdata/korean.ut8.txt
index cfbf3e0722fcc..4e7bce8a38c47 100644
--- a/src/demo/share/java2d/J2DBench/resources/textdata/korean.ut8.txt
+++ b/src/demo/share/java2d/J2DBench/resources/textdata/korean.ut8.txt
@@ -1,4 +1,4 @@
-유니코드에 대해 ?
+유니코드에 대해 ?
어떤 플랫폼,
어떤 프로그램,
어떤 언어에도 상관없이
@@ -16,4 +16,4 @@
유니코드 콘소시엄에 대해
유니코드 콘소시엄은 비영리 조직으로서 현대 소프트웨어 제품과 표준에서 텍스트의 표현을 지정하는 유니코드 표준의 사용을 개발하고 확장하며 장려하기 위해 세워졌습니다. 콘소시엄 멤버쉽은 컴퓨터와 정보 처리 산업에 종사하고 있는 광범위한 회사 및 조직의 범위를 나타냅니다. 콘소시엄의 재정은 전적으로 회비에 의해 충당됩니다. 유니코드 컨소시엄에서의 멤버쉽은 전 세계 어느 곳에서나 유니코드 표준을 지원하고 그 확장과 구현을 지원하고자하는 조직과 개인에게 개방되어 있습니다.
-더 자세한 내용은 용어집, 예제 유니코드 사용 가능 제품, 기술 정보 및 기타 유용한 정보를 참조하십시오.
\ No newline at end of file
+더 자세한 내용은 용어집, 예제 유니코드 사용 가능 제품, 기술 정보 및 기타 유용한 정보를 참조하십시오.
diff --git a/src/demo/share/java2d/J2DBench/resources/textdata/thai.ut8.txt b/src/demo/share/java2d/J2DBench/resources/textdata/thai.ut8.txt
index ff961a7ce0f27..7645e67aa441e 100644
--- a/src/demo/share/java2d/J2DBench/resources/textdata/thai.ut8.txt
+++ b/src/demo/share/java2d/J2DBench/resources/textdata/thai.ut8.txt
@@ -1,10 +1,10 @@
-Unicode คืออะไร?
+Unicode คืออะไร?
Unicode กำหนดหมายเลขเฉพาะสำหรับทุกอักขระ
โดยไม่สนใจว่าเป็นแพล็ตฟอร์มใด
ไม่ขึ้นกับว่าจะเป็นโปรแกรมใด
และไม่ว่าจะเป็นภาษาใด
-โดยพื้นฐานแล้ว, คอมพิวเตอร์จะเกี่ยวข้องกับเรื่องของตัวเลข. คอมพิวเตอร์จัดเก็บตัวอักษรและอักขระอื่นๆ โดยการกำหนดหมายเลขให้สำหรับแต่ละตัว. ก่อนหน้าที่๊ Unicode จะถูกสร้างขึ้น, ได้มีระบบ encoding อยู่หลายร้อยระบบสำหรับการกำหนดหมายเลขเหล่านี้. ไม่มี encoding ใดที่มีจำนวนตัวอักขระมากเพียงพอ: ยกตัวอย่างเช่น, เฉพาะในกลุ่มสหภาพยุโรปเพียงแห่งเดียว ก็ต้องการหลาย encoding ในการครอบคลุมทุกภาษาในกลุ่ม. หรือแม้แต่ในภาษาเดี่ยว เช่น ภาษาอังกฤษ ก็ไม่มี encoding ใดที่เพียงพอสำหรับทุกตัวอักษร, เครื่องหมายวรรคตอน และสัญลักษณ์ทางเทคนิคที่ใช้กันอยู่ทั่วไป.
+โดยพื้นฐานแล้ว, คอมพิวเตอร์จะเกี่ยวข้องกับเรื่องของตัวเลข. คอมพิวเตอร์จัดเก็บตัวอักษรและอักขระอื่นๆ โดยการกำหนดหมายเลขให้สำหรับแต่ละตัว. ก่อนหน้าที่๊ Unicode จะถูกสร้างขึ้น, ได้มีระบบ encoding อยู่หลายร้อยระบบสำหรับการกำหนดหมายเลขเหล่านี้. ไม่มี encoding ใดที่มีจำนวนตัวอักขระมากเพียงพอ: ยกตัวอย่างเช่น, เฉพาะในกลุ่มสหภาพยุโรปเพียงแห่งเดียว ก็ต้องการหลาย encoding ในการครอบคลุมทุกภาษาในกลุ่ม. หรือแม้แต่ในภาษาเดี่ยว เช่น ภาษาอังกฤษ ก็ไม่มี encoding ใดที่เพียงพอสำหรับทุกตัวอักษร, เครื่องหมายวรรคตอน และสัญลักษณ์ทางเทคนิคที่ใช้กันอยู่ทั่วไป.
ระบบ encoding เหล่านี้ยังขัดแย้งซึ่งกันและกัน. นั่นก็คือ, ในสอง encoding สามารถใช้หมายเลขเดียวกันสำหรับตัวอักขระสองตัวที่แตกต่างกัน,หรือใช้หมายเลขต่างกันสำหรับอักขระตัวเดียวกัน. ในระบบคอมพิวเตอร์ (โดยเฉพาะเซิร์ฟเวอร์) ต้องมีการสนับสนุนหลาย encoding; และเมื่อข้อมูลที่ผ่านไปมาระหว่างการเข้ารหัสหรือแพล็ตฟอร์มที่ต่างกัน, ข้อมูลนั้นจะเสี่ยงต่อการผิดพลาดเสียหาย.
@@ -14,6 +14,6 @@ Unicode กำหนดหมายเลขเฉพาะสำหรับแ
การรวม Unicode เข้าไปในระบบไคลเอ็นต์-เซิร์ฟเวอร์ หรือแอ็พพลิเคชันแบบ multi-tiered และเว็บไซต์ จะทำให้เกิดการประหยัดค่าใช้จ่ายมากกว่าการใช้ชุดอักขระแบบเดิม. Unicode ทำให้ผลิตภัณฑ์ซอฟต์แวร์หนึ่งเดียว หรือเว็บไซต์แห่งเดียว รองรับได้หลายแพล็ตฟอร์ม, หลายภาษาและหลายประเทศโดยไม่ต้องทำการรื้อปรับระบบ. Unicode ยังทำให้ข้อมูลสามารถเคลื่อนย้ายไปมาในหลายๆ ระบบโดยไม่เกิดความผิดพลาดเสียหาย.
เกี่ยวกับ Unicode Consortium
-Unicode Consortium เป็นองค์กรไม่แสวงหากำไรที่ก่อตั้งขึ้นเพื่อพัฒนา, ขยายและส่งเสริมการใช้ Unicode Standard, ซึ่งกำหนดรูปแบบการแทนค่าของข้อความในผลิตภัณฑ์ซอฟต์แวร์และมาตรฐานใหม่ๆ. สมาชิกของสมาคมเป็นตัวแทนจากบริษัทและองค์กรในอุตสาหกรรมคอมพิวเตอร์และการประมวลผลสารสนเทศ. สมาคมได้รับการสนับสนุนทางการเงินผ่านทางค่าธรรมเนียมของการเป็นสมาชิกเท่านั้น. สมาชิกภาพของ Unicode Consortium เปิดกว้างสำหรับองค์กรหรือบุคคลใดๆ ในโลกที่ต้องการสนับสนุน Unicode Standard และช่วยเหลือการขยายตัวและการนำ Unicode ไปใช้งาน.
+Unicode Consortium เป็นองค์กรไม่แสวงหากำไรที่ก่อตั้งขึ้นเพื่อพัฒนา, ขยายและส่งเสริมการใช้ Unicode Standard, ซึ่งกำหนดรูปแบบการแทนค่าของข้อความในผลิตภัณฑ์ซอฟต์แวร์และมาตรฐานใหม่ๆ. สมาชิกของสมาคมเป็นตัวแทนจากบริษัทและองค์กรในอุตสาหกรรมคอมพิวเตอร์และการประมวลผลสารสนเทศ. สมาคมได้รับการสนับสนุนทางการเงินผ่านทางค่าธรรมเนียมของการเป็นสมาชิกเท่านั้น. สมาชิกภาพของ Unicode Consortium เปิดกว้างสำหรับองค์กรหรือบุคคลใดๆ ในโลกที่ต้องการสนับสนุน Unicode Standard และช่วยเหลือการขยายตัวและการนำ Unicode ไปใช้งาน.
-สำหรับข้อมูลเพิ่มเติม, ให้ดูที่ Glossary, Sample Unicode-Enabled Products, Technical Introduction และ Useful Resources.
\ No newline at end of file
+สำหรับข้อมูลเพิ่มเติม, ให้ดูที่ Glossary, Sample Unicode-Enabled Products, Technical Introduction และ Useful Resources.
diff --git a/src/demo/share/jfc/CodePointIM/README_zh_CN.html b/src/demo/share/jfc/CodePointIM/README_zh_CN.html
index 782d4288f7e60..118e5a2109231 100644
--- a/src/demo/share/jfc/CodePointIM/README_zh_CN.html
+++ b/src/demo/share/jfc/CodePointIM/README_zh_CN.html
@@ -1,4 +1,4 @@
-
+
自述文件——代码点输入法
diff --git a/src/demo/share/jfc/CodePointIM/com/sun/inputmethods/internal/codepointim/CodePointInputMethodDescriptor.java b/src/demo/share/jfc/CodePointIM/com/sun/inputmethods/internal/codepointim/CodePointInputMethodDescriptor.java
index 320a1ce7e669a..b608c92494236 100644
--- a/src/demo/share/jfc/CodePointIM/com/sun/inputmethods/internal/codepointim/CodePointInputMethodDescriptor.java
+++ b/src/demo/share/jfc/CodePointIM/com/sun/inputmethods/internal/codepointim/CodePointInputMethodDescriptor.java
@@ -63,7 +63,7 @@ public CodePointInputMethodDescriptor() {
* Creates a new instance of the Code Point input method.
*
* @return a new instance of the Code Point input method
- * @exception Exception any exception that may occur while creating the
+ * @throws Exception any exception that may occur while creating the
* input method instance
*/
public InputMethod createInputMethod() throws Exception {
diff --git a/src/hotspot/cpu/aarch64/aarch64.ad b/src/hotspot/cpu/aarch64/aarch64.ad
index 5bfe697d12c0b..1b128f2a0cec9 100644
--- a/src/hotspot/cpu/aarch64/aarch64.ad
+++ b/src/hotspot/cpu/aarch64/aarch64.ad
@@ -1,5 +1,5 @@
//
-// Copyright (c) 2003, 2024, Oracle and/or its affiliates. All rights reserved.
+// Copyright (c) 2003, 2025, Oracle and/or its affiliates. All rights reserved.
// Copyright (c) 2014, 2024, Red Hat, Inc. All rights reserved.
// DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
//
@@ -2296,6 +2296,26 @@ bool Matcher::match_rule_supported(int opcode) {
return false;
}
break;
+ case Op_FmaHF:
+ // UseFMA flag also needs to be checked along with FEAT_FP16
+ if (!UseFMA || !is_feat_fp16_supported()) {
+ return false;
+ }
+ break;
+ case Op_AddHF:
+ case Op_SubHF:
+ case Op_MulHF:
+ case Op_DivHF:
+ case Op_MinHF:
+ case Op_MaxHF:
+ case Op_SqrtHF:
+ // Half-precision floating point scalar operations require FEAT_FP16
+ // to be available. FEAT_FP16 is enabled if both "fphp" and "asimdhp"
+ // features are supported.
+ if (!is_feat_fp16_supported()) {
+ return false;
+ }
+ break;
}
return true; // Per default match rules are supported.
@@ -2306,11 +2326,11 @@ const RegMask* Matcher::predicate_reg_mask(void) {
}
bool Matcher::supports_vector_calling_convention(void) {
- return EnableVectorSupport && UseVectorStubs;
+ return EnableVectorSupport;
}
OptoRegPair Matcher::vector_return_value(uint ideal_reg) {
- assert(EnableVectorSupport && UseVectorStubs, "sanity");
+ assert(EnableVectorSupport, "sanity");
int lo = V0_num;
int hi = V0_H_num;
if (ideal_reg == Op_VecX || ideal_reg == Op_VecA) {
@@ -4599,6 +4619,15 @@ operand immF0()
interface(CONST_INTER);
%}
+// Half Float (FP16) Immediate
+operand immH()
+%{
+ match(ConH);
+ op_cost(0);
+ format %{ %}
+ interface(CONST_INTER);
+%}
+
//
operand immFPacked()
%{
@@ -6942,6 +6971,21 @@ instruct loadConD(vRegD dst, immD con) %{
ins_pipe(fp_load_constant_d);
%}
+// Load Half Float Constant
+// The "ldr" instruction loads a 32-bit word from the constant pool into a
+// 32-bit register but only the bottom half will be populated and the top
+// 16 bits are zero.
+instruct loadConH(vRegF dst, immH con) %{
+ match(Set dst con);
+ format %{
+ "ldrs $dst, [$constantaddress]\t# load from constant table: half float=$con\n\t"
+ %}
+ ins_encode %{
+ __ ldrs(as_FloatRegister($dst$$reg), $constantaddress($con));
+ %}
+ ins_pipe(fp_load_constant_s);
+%}
+
// Store Instructions
// Store Byte
@@ -8144,6 +8188,7 @@ instruct castPP(iRegPNoSp dst)
instruct castII(iRegI dst)
%{
+ predicate(VerifyConstraintCasts == 0);
match(Set dst (CastII dst));
size(0);
@@ -8153,8 +8198,22 @@ instruct castII(iRegI dst)
ins_pipe(pipe_class_empty);
%}
+instruct castII_checked(iRegI dst, rFlagsReg cr)
+%{
+ predicate(VerifyConstraintCasts > 0);
+ match(Set dst (CastII dst));
+ effect(KILL cr);
+
+ format %{ "# castII_checked of $dst" %}
+ ins_encode %{
+ __ verify_int_in_range(_idx, bottom_type()->is_int(), $dst$$Register, rscratch1);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
instruct castLL(iRegL dst)
%{
+ predicate(VerifyConstraintCasts == 0);
match(Set dst (CastLL dst));
size(0);
@@ -8164,6 +8223,19 @@ instruct castLL(iRegL dst)
ins_pipe(pipe_class_empty);
%}
+instruct castLL_checked(iRegL dst, rFlagsReg cr)
+%{
+ predicate(VerifyConstraintCasts > 0);
+ match(Set dst (CastLL dst));
+ effect(KILL cr);
+
+ format %{ "# castLL_checked of $dst" %}
+ ins_encode %{
+ __ verify_long_in_range(_idx, bottom_type()->is_long(), $dst$$Register, rscratch1);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
instruct castFF(vRegF dst)
%{
match(Set dst (CastFF dst));
@@ -13606,6 +13678,17 @@ instruct bits_reverse_L(iRegLNoSp dst, iRegL src)
// ============================================================================
// Floating Point Arithmetic Instructions
+instruct addHF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{
+ match(Set dst (AddHF src1 src2));
+ format %{ "faddh $dst, $src1, $src2" %}
+ ins_encode %{
+ __ faddh($dst$$FloatRegister,
+ $src1$$FloatRegister,
+ $src2$$FloatRegister);
+ %}
+ ins_pipe(fp_dop_reg_reg_s);
+%}
+
instruct addF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{
match(Set dst (AddF src1 src2));
@@ -13636,6 +13719,17 @@ instruct addD_reg_reg(vRegD dst, vRegD src1, vRegD src2) %{
ins_pipe(fp_dop_reg_reg_d);
%}
+instruct subHF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{
+ match(Set dst (SubHF src1 src2));
+ format %{ "fsubh $dst, $src1, $src2" %}
+ ins_encode %{
+ __ fsubh($dst$$FloatRegister,
+ $src1$$FloatRegister,
+ $src2$$FloatRegister);
+ %}
+ ins_pipe(fp_dop_reg_reg_s);
+%}
+
instruct subF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{
match(Set dst (SubF src1 src2));
@@ -13666,6 +13760,17 @@ instruct subD_reg_reg(vRegD dst, vRegD src1, vRegD src2) %{
ins_pipe(fp_dop_reg_reg_d);
%}
+instruct mulHF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{
+ match(Set dst (MulHF src1 src2));
+ format %{ "fmulh $dst, $src1, $src2" %}
+ ins_encode %{
+ __ fmulh($dst$$FloatRegister,
+ $src1$$FloatRegister,
+ $src2$$FloatRegister);
+ %}
+ ins_pipe(fp_dop_reg_reg_s);
+%}
+
instruct mulF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{
match(Set dst (MulF src1 src2));
@@ -13696,6 +13801,20 @@ instruct mulD_reg_reg(vRegD dst, vRegD src1, vRegD src2) %{
ins_pipe(fp_dop_reg_reg_d);
%}
+// src1 * src2 + src3 (half-precision float)
+instruct maddHF_reg_reg(vRegF dst, vRegF src1, vRegF src2, vRegF src3) %{
+ match(Set dst (FmaHF src3 (Binary src1 src2)));
+ format %{ "fmaddh $dst, $src1, $src2, $src3" %}
+ ins_encode %{
+ assert(UseFMA, "Needs FMA instructions support.");
+ __ fmaddh($dst$$FloatRegister,
+ $src1$$FloatRegister,
+ $src2$$FloatRegister,
+ $src3$$FloatRegister);
+ %}
+ ins_pipe(pipe_class_default);
+%}
+
// src1 * src2 + src3
instruct maddF_reg_reg(vRegF dst, vRegF src1, vRegF src2, vRegF src3) %{
match(Set dst (FmaF src3 (Binary src1 src2)));
@@ -13837,6 +13956,29 @@ instruct mnsubD_reg_reg(vRegD dst, vRegD src1, vRegD src2, vRegD src3, immD0 zer
ins_pipe(pipe_class_default);
%}
+// Math.max(HH)H (half-precision float)
+instruct maxHF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{
+ match(Set dst (MaxHF src1 src2));
+ format %{ "fmaxh $dst, $src1, $src2" %}
+ ins_encode %{
+ __ fmaxh($dst$$FloatRegister,
+ $src1$$FloatRegister,
+ $src2$$FloatRegister);
+ %}
+ ins_pipe(fp_dop_reg_reg_s);
+%}
+
+// Math.min(HH)H (half-precision float)
+instruct minHF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{
+ match(Set dst (MinHF src1 src2));
+ format %{ "fminh $dst, $src1, $src2" %}
+ ins_encode %{
+ __ fminh($dst$$FloatRegister,
+ $src1$$FloatRegister,
+ $src2$$FloatRegister);
+ %}
+ ins_pipe(fp_dop_reg_reg_s);
+%}
// Math.max(FF)F
instruct maxF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{
@@ -13894,6 +14036,16 @@ instruct minD_reg_reg(vRegD dst, vRegD src1, vRegD src2) %{
ins_pipe(fp_dop_reg_reg_d);
%}
+instruct divHF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{
+ match(Set dst (DivHF src1 src2));
+ format %{ "fdivh $dst, $src1, $src2" %}
+ ins_encode %{
+ __ fdivh($dst$$FloatRegister,
+ $src1$$FloatRegister,
+ $src2$$FloatRegister);
+ %}
+ ins_pipe(fp_div_s);
+%}
instruct divF_reg_reg(vRegF dst, vRegF src1, vRegF src2) %{
match(Set dst (DivF src1 src2));
@@ -14067,6 +14219,16 @@ instruct sqrtF_reg(vRegF dst, vRegF src) %{
ins_pipe(fp_div_d);
%}
+instruct sqrtHF_reg(vRegF dst, vRegF src) %{
+ match(Set dst (SqrtHF src));
+ format %{ "fsqrth $dst, $src" %}
+ ins_encode %{
+ __ fsqrth($dst$$FloatRegister,
+ $src$$FloatRegister);
+ %}
+ ins_pipe(fp_div_s);
+%}
+
// Math.rint, floor, ceil
instruct roundD_reg(vRegD dst, vRegD src, immI rmode) %{
match(Set dst (RoundDoubleMode src rmode));
@@ -16297,7 +16459,8 @@ instruct ShouldNotReachHere() %{
ins_encode %{
if (is_reachable()) {
- __ stop(_halt_reason);
+ const char* str = __ code_string(_halt_reason);
+ __ stop(str);
}
%}
@@ -17115,6 +17278,64 @@ instruct expandBitsL_memcon(iRegINoSp dst, memory8 mem, immL mask,
ins_pipe(pipe_slow);
%}
+//----------------------------- Reinterpret ----------------------------------
+// Reinterpret a half-precision float value in a floating point register to a general purpose register
+instruct reinterpretHF2S(iRegINoSp dst, vRegF src) %{
+ match(Set dst (ReinterpretHF2S src));
+ format %{ "reinterpretHF2S $dst, $src" %}
+ ins_encode %{
+ __ smov($dst$$Register, $src$$FloatRegister, __ H, 0);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+// Reinterpret a half-precision float value in a general purpose register to a floating point register
+instruct reinterpretS2HF(vRegF dst, iRegINoSp src) %{
+ match(Set dst (ReinterpretS2HF src));
+ format %{ "reinterpretS2HF $dst, $src" %}
+ ins_encode %{
+ __ mov($dst$$FloatRegister, __ H, 0, $src$$Register);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+// Without this optimization, ReinterpretS2HF (ConvF2HF src) would result in the following
+// instructions (the first two are for ConvF2HF and the last instruction is for ReinterpretS2HF) -
+// fcvt $tmp1_fpr, $src_fpr // Convert float to half-precision float
+// mov $tmp2_gpr, $tmp1_fpr // Move half-precision float in FPR to a GPR
+// mov $dst_fpr, $tmp2_gpr // Move the result from a GPR to an FPR
+// The move from FPR to GPR in ConvF2HF and the move from GPR to FPR in ReinterpretS2HF
+// can be omitted in this pattern, resulting in -
+// fcvt $dst, $src // Convert float to half-precision float
+instruct convF2HFAndS2HF(vRegF dst, vRegF src)
+%{
+ match(Set dst (ReinterpretS2HF (ConvF2HF src)));
+ format %{ "convF2HFAndS2HF $dst, $src" %}
+ ins_encode %{
+ __ fcvtsh($dst$$FloatRegister, $src$$FloatRegister);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+// Without this optimization, ConvHF2F (ReinterpretHF2S src) would result in the following
+// instructions (the first one is for ReinterpretHF2S and the last two are for ConvHF2F) -
+// mov $tmp1_gpr, $src_fpr // Move the half-precision float from an FPR to a GPR
+// mov $tmp2_fpr, $tmp1_gpr // Move the same value from GPR to an FPR
+// fcvt $dst_fpr, $tmp2_fpr // Convert the half-precision float to 32-bit float
+// The move from FPR to GPR in ReinterpretHF2S and the move from GPR to FPR in ConvHF2F
+// can be omitted as the input (src) is already in an FPR required for the fcvths instruction
+// resulting in -
+// fcvt $dst, $src // Convert half-precision float to a 32-bit float
+instruct convHF2SAndHF2F(vRegF dst, vRegF src)
+%{
+ match(Set dst (ConvHF2F (ReinterpretHF2S src)));
+ format %{ "convHF2SAndHF2F $dst, $src" %}
+ ins_encode %{
+ __ fcvths($dst$$FloatRegister, $src$$FloatRegister);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
// ============================================================================
// This name is KNOWN by the ADLC and cannot be changed.
// The ADLC forces a 'TypeRawPtr::BOTTOM' output type
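
Aside (not part of the patch): the ConvF2HF and ConvHF2F ideal nodes targeted by the new aarch64.ad rules above correspond to the standard half-precision conversions on java.lang.Float. A small illustrative Java sketch of that round trip follows; the class name and the sample value are placeholders.

    // Sketch only: float -> IEEE 754 binary16 bit pattern -> float,
    // the Java-level operations behind ConvF2HF / ConvHF2F.
    public class HalfFloatRoundTrip {
        public static void main(String[] args) {
            float f = 1.5f;                           // arbitrary sample value
            short bits = Float.floatToFloat16(f);    // ConvF2HF direction
            float back = Float.float16ToFloat(bits); // ConvHF2F direction
            System.out.printf("%f -> 0x%04x -> %f%n", f, bits, back);
        }
    }
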
diff --git a/src/hotspot/cpu/aarch64/assembler_aarch64.hpp b/src/hotspot/cpu/aarch64/assembler_aarch64.hpp
index 3db7d30884429..5c02e30963eaa 100644
--- a/src/hotspot/cpu/aarch64/assembler_aarch64.hpp
+++ b/src/hotspot/cpu/aarch64/assembler_aarch64.hpp
@@ -2032,6 +2032,8 @@ void mvnw(Register Rd, Register Rm,
INSN(fsqrtd, 0b01, 0b000011);
INSN(fcvtd, 0b01, 0b000100); // Double-precision to single-precision
+ INSN(fsqrth, 0b11, 0b000011); // Half-precision sqrt
+
private:
void _fcvt_narrow_extend(FloatRegister Vd, SIMD_Arrangement Ta,
FloatRegister Vn, SIMD_Arrangement Tb, bool do_extend) {
@@ -2059,37 +2061,68 @@ void mvnw(Register Rd, Register Rm,
#undef INSN
// Floating-point data-processing (2 source)
- void data_processing(unsigned op31, unsigned type, unsigned opcode,
+ void data_processing(unsigned op31, unsigned type, unsigned opcode, unsigned op21,
FloatRegister Vd, FloatRegister Vn, FloatRegister Vm) {
starti;
f(op31, 31, 29);
f(0b11110, 28, 24);
- f(type, 23, 22), f(1, 21), f(opcode, 15, 10);
+ f(type, 23, 22), f(op21, 21), f(opcode, 15, 10);
rf(Vm, 16), rf(Vn, 5), rf(Vd, 0);
}
-#define INSN(NAME, op31, type, opcode) \
+#define INSN(NAME, op31, type, opcode, op21) \
void NAME(FloatRegister Vd, FloatRegister Vn, FloatRegister Vm) { \
- data_processing(op31, type, opcode, Vd, Vn, Vm); \
- }
-
- INSN(fabds, 0b011, 0b10, 0b110101);
- INSN(fmuls, 0b000, 0b00, 0b000010);
- INSN(fdivs, 0b000, 0b00, 0b000110);
- INSN(fadds, 0b000, 0b00, 0b001010);
- INSN(fsubs, 0b000, 0b00, 0b001110);
- INSN(fmaxs, 0b000, 0b00, 0b010010);
- INSN(fmins, 0b000, 0b00, 0b010110);
- INSN(fnmuls, 0b000, 0b00, 0b100010);
-
- INSN(fabdd, 0b011, 0b11, 0b110101);
- INSN(fmuld, 0b000, 0b01, 0b000010);
- INSN(fdivd, 0b000, 0b01, 0b000110);
- INSN(faddd, 0b000, 0b01, 0b001010);
- INSN(fsubd, 0b000, 0b01, 0b001110);
- INSN(fmaxd, 0b000, 0b01, 0b010010);
- INSN(fmind, 0b000, 0b01, 0b010110);
- INSN(fnmuld, 0b000, 0b01, 0b100010);
+ data_processing(op31, type, opcode, op21, Vd, Vn, Vm); \
+ }
+
+ INSN(fmuls, 0b000, 0b00, 0b000010, 0b1);
+ INSN(fdivs, 0b000, 0b00, 0b000110, 0b1);
+ INSN(fadds, 0b000, 0b00, 0b001010, 0b1);
+ INSN(fsubs, 0b000, 0b00, 0b001110, 0b1);
+ INSN(fmaxs, 0b000, 0b00, 0b010010, 0b1);
+ INSN(fmins, 0b000, 0b00, 0b010110, 0b1);
+ INSN(fnmuls, 0b000, 0b00, 0b100010, 0b1);
+
+ INSN(fmuld, 0b000, 0b01, 0b000010, 0b1);
+ INSN(fdivd, 0b000, 0b01, 0b000110, 0b1);
+ INSN(faddd, 0b000, 0b01, 0b001010, 0b1);
+ INSN(fsubd, 0b000, 0b01, 0b001110, 0b1);
+ INSN(fmaxd, 0b000, 0b01, 0b010010, 0b1);
+ INSN(fmind, 0b000, 0b01, 0b010110, 0b1);
+ INSN(fnmuld, 0b000, 0b01, 0b100010, 0b1);
+
+ // Half-precision floating-point instructions
+ INSN(fmulh, 0b000, 0b11, 0b000010, 0b1);
+ INSN(fdivh, 0b000, 0b11, 0b000110, 0b1);
+ INSN(faddh, 0b000, 0b11, 0b001010, 0b1);
+ INSN(fsubh, 0b000, 0b11, 0b001110, 0b1);
+ INSN(fmaxh, 0b000, 0b11, 0b010010, 0b1);
+ INSN(fminh, 0b000, 0b11, 0b010110, 0b1);
+ INSN(fnmulh, 0b000, 0b11, 0b100010, 0b1);
+#undef INSN
+
+// Advanced SIMD scalar three same
+#define INSN(NAME, U, size, opcode) \
+ void NAME(FloatRegister Vd, FloatRegister Vn, FloatRegister Vm) { \
+ starti; \
+ f(0b01, 31, 30), f(U, 29), f(0b11110, 28, 24), f(size, 23, 22), f(1, 21); \
+ rf(Vm, 16), f(opcode, 15, 11), f(1, 10), rf(Vn, 5), rf(Vd, 0); \
+ }
+
+ INSN(fabds, 0b1, 0b10, 0b11010); // Floating-point Absolute Difference (single-precision)
+ INSN(fabdd, 0b1, 0b11, 0b11010); // Floating-point Absolute Difference (double-precision)
+
+#undef INSN
+
+// Advanced SIMD scalar three same FP16
+#define INSN(NAME, U, a, opcode) \
+ void NAME(FloatRegister Vd, FloatRegister Vn, FloatRegister Vm) { \
+ starti; \
+ f(0b01, 31, 30), f(U, 29), f(0b11110, 28, 24), f(a, 23), f(0b10, 22, 21); \
+ rf(Vm, 16), f(0b00, 15, 14), f(opcode, 13, 11), f(1, 10), rf(Vn, 5), rf(Vd, 0); \
+ }
+
+ INSN(fabdh, 0b1, 0b1, 0b010); // Floating-point Absolute Difference (half-precision float)
#undef INSN
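To make the bit layout concrete, the following standalone sketch (not part of the patch) re-derives the "Advanced SIMD scalar three same FP16" word used by fabdh; each shift mirrors one of the f(...) field placements above:

    #include <cstdint>
    // Hypothetical helper, not HotSpot code: assembles fabdh Vd, Vn, Vm.
    uint32_t encode_fabdh(uint32_t rd, uint32_t rn, uint32_t rm) {
      uint32_t insn = 0;
      insn |= 0b01u    << 30;    // bits 31..30
      insn |= 1u       << 29;    // U
      insn |= 0b11110u << 24;    // bits 28..24
      insn |= 1u       << 23;    // a
      insn |= 0b10u    << 21;    // bits 22..21
      insn |= (rm & 31u) << 16;  // Vm
      insn |= 0b010u   << 11;    // opcode, bits 13..11 (bits 15..14 stay 0)
      insn |= 1u       << 10;
      insn |= (rn & 31u) << 5;   // Vn
      insn |= (rd & 31u);        // Vd
      return insn;
    }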
@@ -2120,6 +2153,7 @@ void mvnw(Register Rd, Register Rm,
INSN(fnmaddd, 0b000, 0b01, 1, 0);
INSN(fnmsub, 0b000, 0b01, 1, 1);
+ INSN(fmaddh, 0b000, 0b11, 0, 0); // half-precision fused multiply-add (scalar)
#undef INSN
// Floating-point conditional select
diff --git a/src/hotspot/cpu/aarch64/c1_CodeStubs_aarch64.cpp b/src/hotspot/cpu/aarch64/c1_CodeStubs_aarch64.cpp
index 2334cbdff24e4..2e53ecb805829 100644
--- a/src/hotspot/cpu/aarch64/c1_CodeStubs_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/c1_CodeStubs_aarch64.cpp
@@ -69,7 +69,7 @@ void RangeCheckStub::emit_code(LIR_Assembler* ce) {
__ far_call(RuntimeAddress(a));
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
return;
}
@@ -90,7 +90,7 @@ void RangeCheckStub::emit_code(LIR_Assembler* ce) {
__ blr(lr);
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
PredicateFailedStub::PredicateFailedStub(CodeEmitInfo* info) {
@@ -103,7 +103,7 @@ void PredicateFailedStub::emit_code(LIR_Assembler* ce) {
__ far_call(RuntimeAddress(a));
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
void DivByZeroStub::emit_code(LIR_Assembler* ce) {
@@ -274,7 +274,7 @@ void ImplicitNullCheckStub::emit_code(LIR_Assembler* ce) {
__ far_call(RuntimeAddress(a));
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
@@ -289,7 +289,7 @@ void SimpleExceptionStub::emit_code(LIR_Assembler* ce) {
}
__ far_call(RuntimeAddress(Runtime1::entry_for(_stub)), rscratch2);
ce->add_call_info_here(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
diff --git a/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp b/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
index 6b1a5a7f1e0c4..afa2ddb47b454 100644
--- a/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp
@@ -72,16 +72,17 @@ int C1_MacroAssembler::lock_object(Register hdr, Register obj, Register disp_hdr
null_check_offset = offset();
- if (DiagnoseSyncOnValueBasedClasses != 0) {
- load_klass(hdr, obj);
- ldrb(hdr, Address(hdr, Klass::misc_flags_offset()));
- tst(hdr, KlassFlags::_misc_is_value_based_class);
- br(Assembler::NE, slow_case);
- }
-
if (LockingMode == LM_LIGHTWEIGHT) {
lightweight_lock(disp_hdr, obj, hdr, temp, rscratch2, slow_case);
} else if (LockingMode == LM_LEGACY) {
+
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(hdr, obj);
+ ldrb(hdr, Address(hdr, Klass::misc_flags_offset()));
+ tst(hdr, KlassFlags::_misc_is_value_based_class);
+ br(Assembler::NE, slow_case);
+ }
+
Label done;
// Load object header
ldr(hdr, Address(obj, hdr_offset));
diff --git a/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp b/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp
index 063918ee20b7b..a6aab24349a1f 100644
--- a/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/c1_Runtime1_aarch64.cpp
@@ -91,10 +91,10 @@ int StubAssembler::call_RT(Register oop_result1, Register metadata_result, addre
// exception pending => remove activation and forward to exception handler
// make sure that the vm_results are cleared
if (oop_result1->is_valid()) {
- str(zr, Address(rthread, JavaThread::vm_result_offset()));
+ str(zr, Address(rthread, JavaThread::vm_result_oop_offset()));
}
if (metadata_result->is_valid()) {
- str(zr, Address(rthread, JavaThread::vm_result_2_offset()));
+ str(zr, Address(rthread, JavaThread::vm_result_metadata_offset()));
}
if (frame_size() == no_frame_size) {
leave();
@@ -108,10 +108,10 @@ int StubAssembler::call_RT(Register oop_result1, Register metadata_result, addre
}
// get oop results if there are any and reset the values in the thread
if (oop_result1->is_valid()) {
- get_vm_result(oop_result1, rthread);
+ get_vm_result_oop(oop_result1, rthread);
}
if (metadata_result->is_valid()) {
- get_vm_result_2(metadata_result, rthread);
+ get_vm_result_metadata(metadata_result, rthread);
}
return call_offset;
}
@@ -406,8 +406,8 @@ OopMapSet* Runtime1::generate_handle_exception(C1StubId id, StubAssembler *sasm)
__ authenticate_return_address(exception_pc);
// make sure that the vm_results are cleared (may be unnecessary)
- __ str(zr, Address(rthread, JavaThread::vm_result_offset()));
- __ str(zr, Address(rthread, JavaThread::vm_result_2_offset()));
+ __ str(zr, Address(rthread, JavaThread::vm_result_oop_offset()));
+ __ str(zr, Address(rthread, JavaThread::vm_result_metadata_offset()));
break;
case C1StubId::handle_exception_nofpu_id:
case C1StubId::handle_exception_id:
diff --git a/src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp b/src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp
index 605a05a44a731..585812a99eec2 100644
--- a/src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp
@@ -360,7 +360,7 @@ void C2_MacroAssembler::fast_lock_lightweight(Register obj, Register box, Regist
Label slow_path;
if (UseObjectMonitorTable) {
- // Clear cache in case fast locking succeeds.
+ // Clear cache in case fast locking succeeds or we need to take the slow-path.
str(zr, Address(box, BasicLock::object_monitor_cache_offset_in_bytes()));
}
@@ -2743,3 +2743,107 @@ bool C2_MacroAssembler::in_scratch_emit_size() {
}
return MacroAssembler::in_scratch_emit_size();
}
+
+static void abort_verify_int_in_range(uint idx, jint val, jint lo, jint hi) {
+ fatal("Invalid CastII, idx: %u, val: %d, lo: %d, hi: %d", idx, val, lo, hi);
+}
+
+void C2_MacroAssembler::verify_int_in_range(uint idx, const TypeInt* t, Register rval, Register rtmp) {
+ assert(!t->empty() && !t->singleton(), "%s", Type::str(t));
+ if (t == TypeInt::INT) {
+ return;
+ }
+ BLOCK_COMMENT("verify_int_in_range {");
+ Label L_success, L_failure;
+
+ jint lo = t->_lo;
+ jint hi = t->_hi;
+
+ if (lo != min_jint && hi != max_jint) {
+ subsw(rtmp, rval, lo);
+ br(Assembler::LT, L_failure);
+ subsw(rtmp, rval, hi);
+ br(Assembler::LE, L_success);
+ } else if (lo != min_jint) {
+ subsw(rtmp, rval, lo);
+ br(Assembler::GE, L_success);
+ } else if (hi != max_jint) {
+ subsw(rtmp, rval, hi);
+ br(Assembler::LE, L_success);
+ } else {
+ ShouldNotReachHere();
+ }
+
+ bind(L_failure);
+ movw(c_rarg0, idx);
+ mov(c_rarg1, rval);
+ movw(c_rarg2, lo);
+ movw(c_rarg3, hi);
+ reconstruct_frame_pointer(rtmp);
+ rt_call(CAST_FROM_FN_PTR(address, abort_verify_int_in_range), rtmp);
+ hlt(0);
+
+ bind(L_success);
+ BLOCK_COMMENT("} verify_int_in_range");
+}
+
+static void abort_verify_long_in_range(uint idx, jlong val, jlong lo, jlong hi) {
+ fatal("Invalid CastLL, idx: %u, val: " JLONG_FORMAT ", lo: " JLONG_FORMAT ", hi: " JLONG_FORMAT, idx, val, lo, hi);
+}
+
+void C2_MacroAssembler::verify_long_in_range(uint idx, const TypeLong* t, Register rval, Register rtmp) {
+ assert(!t->empty() && !t->singleton(), "%s", Type::str(t));
+ if (t == TypeLong::LONG) {
+ return;
+ }
+ BLOCK_COMMENT("verify_long_in_range {");
+ Label L_success, L_failure;
+
+ jlong lo = t->_lo;
+ jlong hi = t->_hi;
+
+ if (lo != min_jlong && hi != max_jlong) {
+ subs(rtmp, rval, lo);
+ br(Assembler::LT, L_failure);
+ subs(rtmp, rval, hi);
+ br(Assembler::LE, L_success);
+ } else if (lo != min_jlong) {
+ subs(rtmp, rval, lo);
+ br(Assembler::GE, L_success);
+ } else if (hi != max_jlong) {
+ subs(rtmp, rval, hi);
+ br(Assembler::LE, L_success);
+ } else {
+ ShouldNotReachHere();
+ }
+
+ bind(L_failure);
+ movw(c_rarg0, idx);
+ mov(c_rarg1, rval);
+ mov(c_rarg2, lo);
+ mov(c_rarg3, hi);
+ reconstruct_frame_pointer(rtmp);
+ rt_call(CAST_FROM_FN_PTR(address, abort_verify_long_in_range), rtmp);
+ hlt(0);
+
+ bind(L_success);
+ BLOCK_COMMENT("} verify_long_in_range");
+}
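Both range verifiers emit the same interval test; the following standalone C++ sketch (not part of the patch) spells out what the compare-and-branch sequences above decide:

    #include <cstdint>
    // Hypothetical helper, not HotSpot code. type_min/type_max stand in for
    // min_jint/max_jint or min_jlong/max_jlong.
    static bool value_in_range(int64_t val, int64_t lo, int64_t hi,
                               int64_t type_min, int64_t type_max) {
      if (lo != type_min && hi != type_max) {
        return val >= lo && val <= hi;   // subs + br(LT, failure), subs + br(LE, success)
      } else if (lo != type_min) {
        return val >= lo;                // subs + br(GE, success)
      } else if (hi != type_max) {
        return val <= hi;                // subs + br(LE, success)
      }
      return true;                       // the unconstrained type returns before any check is emitted
    }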
+
+void C2_MacroAssembler::reconstruct_frame_pointer(Register rtmp) {
+ const int framesize = Compile::current()->output()->frame_size_in_bytes();
+ if (PreserveFramePointer) {
+ // frame pointer is valid
+#ifdef ASSERT
+ // Verify frame pointer value in rfp.
+ add(rtmp, sp, framesize - 2 * wordSize);
+ Label L_success;
+ cmp(rfp, rtmp);
+ br(Assembler::EQ, L_success);
+ stop("frame pointer mismatch");
+ bind(L_success);
+#endif // ASSERT
+ } else {
+ add(rfp, sp, framesize - 2 * wordSize);
+ }
+}
diff --git a/src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.hpp b/src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.hpp
index e0eaa0b76e6e9..70e4265c7cc5e 100644
--- a/src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.hpp
+++ b/src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.hpp
@@ -188,4 +188,9 @@
void vector_signum_sve(FloatRegister dst, FloatRegister src, FloatRegister zero,
FloatRegister one, FloatRegister vtmp, PRegister pgtmp, SIMD_RegVariant T);
+ void verify_int_in_range(uint idx, const TypeInt* t, Register val, Register tmp);
+ void verify_long_in_range(uint idx, const TypeLong* t, Register val, Register tmp);
+
+ void reconstruct_frame_pointer(Register rtmp);
+
#endif // CPU_AARCH64_C2_MACROASSEMBLER_AARCH64_HPP
diff --git a/src/hotspot/cpu/aarch64/compressedKlass_aarch64.cpp b/src/hotspot/cpu/aarch64/compressedKlass_aarch64.cpp
index 0c2d9a32c8c13..3874c8cd54ef9 100644
--- a/src/hotspot/cpu/aarch64/compressedKlass_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/compressedKlass_aarch64.cpp
@@ -70,7 +70,7 @@ static char* reserve_at_eor_compatible_address(size_t size, bool aslr) {
const uint64_t immediate = ((uint64_t)immediates[index]) << 32;
assert(immediate > 0 && Assembler::operand_valid_for_logical_immediate(/*is32*/false, immediate),
"Invalid immediate %d " UINT64_FORMAT, index, immediate);
- result = os::attempt_reserve_memory_at((char*)immediate, size, false);
+ result = os::attempt_reserve_memory_at((char*)immediate, size, mtNone);
if (result == nullptr) {
log_trace(metaspace, map)("Failed to attach at " UINT64_FORMAT_X, immediate);
}
@@ -114,7 +114,7 @@ char* CompressedKlassPointers::reserve_address_space_for_compressed_classes(size
if (result == nullptr) {
constexpr size_t alignment = nth_bit(32);
log_debug(metaspace, map)("Trying to reserve at a 32-bit-aligned address");
- result = os::reserve_memory_aligned(size, alignment, false);
+ result = os::reserve_memory_aligned(size, alignment, mtNone);
}
return result;
diff --git a/src/hotspot/cpu/aarch64/gc/shenandoah/c1/shenandoahBarrierSetC1_aarch64.cpp b/src/hotspot/cpu/aarch64/gc/shenandoah/c1/shenandoahBarrierSetC1_aarch64.cpp
index e33ef47cf3c38..e4db8a9ab1f82 100644
--- a/src/hotspot/cpu/aarch64/gc/shenandoah/c1/shenandoahBarrierSetC1_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/gc/shenandoah/c1/shenandoahBarrierSetC1_aarch64.cpp
@@ -27,9 +27,9 @@
#include "c1/c1_MacroAssembler.hpp"
#include "compiler/compilerDefinitions.inline.hpp"
#include "gc/shared/gc_globals.hpp"
+#include "gc/shenandoah/c1/shenandoahBarrierSetC1.hpp"
#include "gc/shenandoah/shenandoahBarrierSet.hpp"
#include "gc/shenandoah/shenandoahBarrierSetAssembler.hpp"
-#include "gc/shenandoah/c1/shenandoahBarrierSetC1.hpp"
#define __ masm->masm()->
diff --git a/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp b/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp
index ac22b43faaf02..a2b3f44c68b72 100644
--- a/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp
@@ -23,6 +23,8 @@
*
*/
+#include "gc/shenandoah/heuristics/shenandoahHeuristics.hpp"
+#include "gc/shenandoah/mode/shenandoahMode.hpp"
#include "gc/shenandoah/shenandoahBarrierSet.hpp"
#include "gc/shenandoah/shenandoahBarrierSetAssembler.hpp"
#include "gc/shenandoah/shenandoahForwarding.hpp"
@@ -30,10 +32,8 @@
#include "gc/shenandoah/shenandoahHeapRegion.hpp"
#include "gc/shenandoah/shenandoahRuntime.hpp"
#include "gc/shenandoah/shenandoahThreadLocalData.hpp"
-#include "gc/shenandoah/heuristics/shenandoahHeuristics.hpp"
-#include "gc/shenandoah/mode/shenandoahMode.hpp"
-#include "interpreter/interpreter.hpp"
#include "interpreter/interp_masm.hpp"
+#include "interpreter/interpreter.hpp"
#include "runtime/javaThread.hpp"
#include "runtime/sharedRuntime.hpp"
#ifdef COMPILER1
diff --git a/src/hotspot/cpu/aarch64/gc/z/zAddress_aarch64.cpp b/src/hotspot/cpu/aarch64/gc/z/zAddress_aarch64.cpp
index a58c91a6a41e1..7008615ed438a 100644
--- a/src/hotspot/cpu/aarch64/gc/z/zAddress_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/gc/z/zAddress_aarch64.cpp
@@ -21,8 +21,8 @@
* questions.
*/
-#include "gc/shared/gcLogPrecious.hpp"
#include "gc/shared/gc_globals.hpp"
+#include "gc/shared/gcLogPrecious.hpp"
#include "gc/z/zAddress.hpp"
#include "gc/z/zBarrierSetAssembler.hpp"
#include "gc/z/zGlobals.hpp"
@@ -95,7 +95,7 @@ size_t ZPlatformAddressOffsetBits() {
static const size_t valid_max_address_offset_bits = probe_valid_max_address_bit() + 1;
const size_t max_address_offset_bits = valid_max_address_offset_bits - 3;
const size_t min_address_offset_bits = max_address_offset_bits - 2;
- const size_t address_offset = round_up_power_of_2(MaxHeapSize * ZVirtualToPhysicalRatio);
+ const size_t address_offset = ZGlobalsPointers::min_address_offset_request();
const size_t address_offset_bits = log2i_exact(address_offset);
return clamp(address_offset_bits, min_address_offset_bits, max_address_offset_bits);
}
diff --git a/src/hotspot/cpu/aarch64/globalDefinitions_aarch64.hpp b/src/hotspot/cpu/aarch64/globalDefinitions_aarch64.hpp
index faf635dc33282..948ba97aa2234 100644
--- a/src/hotspot/cpu/aarch64/globalDefinitions_aarch64.hpp
+++ b/src/hotspot/cpu/aarch64/globalDefinitions_aarch64.hpp
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 1999, 2024, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 1999, 2025, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2014, 2015, Red Hat Inc. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
@@ -46,7 +46,7 @@ const bool CCallingConventionRequiresIntsAsLongs = false;
#define DEFAULT_CACHE_LINE_SIZE 64
// The default padding size for data structures to avoid false sharing.
-#define DEFAULT_PADDING_SIZE DEFAULT_CACHE_LINE_SIZE
+#define DEFAULT_PADDING_SIZE (2*DEFAULT_CACHE_LINE_SIZE)
// According to the ARMv8 ARM, "Concurrent modification and execution
// of instructions can lead to the resulting instruction performing
diff --git a/src/hotspot/cpu/aarch64/icache_aarch64.cpp b/src/hotspot/cpu/aarch64/icache_aarch64.cpp
index 311f3a7de1f73..a942406f45ee9 100644
--- a/src/hotspot/cpu/aarch64/icache_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/icache_aarch64.cpp
@@ -31,4 +31,4 @@ void ICacheStubGenerator::generate_icache_flush(
*flush_icache_stub = nullptr;
}
-void ICache::initialize() {}
+void ICache::initialize(int phase) {}
diff --git a/src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp b/src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp
index d5ba85da989dc..dd1d7d1d2e1b4 100644
--- a/src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp
@@ -693,17 +693,18 @@ void InterpreterMacroAssembler::lock_object(Register lock_reg)
// Load object pointer into obj_reg %c_rarg3
ldr(obj_reg, Address(lock_reg, obj_offset));
- if (DiagnoseSyncOnValueBasedClasses != 0) {
- load_klass(tmp, obj_reg);
- ldrb(tmp, Address(tmp, Klass::misc_flags_offset()));
- tst(tmp, KlassFlags::_misc_is_value_based_class);
- br(Assembler::NE, slow_case);
- }
-
if (LockingMode == LM_LIGHTWEIGHT) {
lightweight_lock(lock_reg, obj_reg, tmp, tmp2, tmp3, slow_case);
b(done);
} else if (LockingMode == LM_LEGACY) {
+
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(tmp, obj_reg);
+ ldrb(tmp, Address(tmp, Klass::misc_flags_offset()));
+ tst(tmp, KlassFlags::_misc_is_value_based_class);
+ br(Assembler::NE, slow_case);
+ }
+
// Load (object->mark() | 1) into swap_reg
ldr(rscratch1, Address(obj_reg, oopDesc::mark_offset_in_bytes()));
orr(swap_reg, rscratch1, 1);
diff --git a/src/hotspot/cpu/aarch64/jvmciCodeInstaller_aarch64.cpp b/src/hotspot/cpu/aarch64/jvmciCodeInstaller_aarch64.cpp
index 3015206dadc4d..071dd2c417992 100644
--- a/src/hotspot/cpu/aarch64/jvmciCodeInstaller_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/jvmciCodeInstaller_aarch64.cpp
@@ -180,6 +180,7 @@ bool CodeInstaller::pd_relocate(address pc, jint mark) {
case POLL_RETURN_FAR:
_instructions->relocate(pc, relocInfo::poll_return_type);
return true;
+#if INCLUDE_ZGC
case Z_BARRIER_RELOCATION_FORMAT_LOAD_GOOD_BEFORE_TB_X:
_instructions->relocate(pc, barrier_Relocation::spec(), ZBarrierRelocationFormatLoadGoodBeforeTbX);
return true;
@@ -192,6 +193,7 @@ bool CodeInstaller::pd_relocate(address pc, jint mark) {
case Z_BARRIER_RELOCATION_FORMAT_STORE_BAD_BEFORE_MOV:
_instructions->relocate(pc, barrier_Relocation::spec(), ZBarrierRelocationFormatStoreBadBeforeMov);
return true;
+#endif
}
return false;
diff --git a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
index b6472b1b94812..cf347768de3e1 100644
--- a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
@@ -675,6 +675,9 @@ void MacroAssembler::set_last_Java_frame(Register last_java_sp,
}
static inline bool target_needs_far_branch(address addr) {
+ if (AOTCodeCache::is_on_for_dump()) {
+ return true;
+ }
// codecache size <= 128M
if (!MacroAssembler::far_branches()) {
return false;
@@ -849,7 +852,7 @@ void MacroAssembler::call_VM_base(Register oop_result,
// get oop result if there is one and reset the value in the thread
if (oop_result->is_valid()) {
- get_vm_result(oop_result, java_thread);
+ get_vm_result_oop(oop_result, java_thread);
}
}
@@ -859,6 +862,9 @@ void MacroAssembler::call_VM_helper(Register oop_result, address entry_point, in
// Check the entry target is always reachable from any branch.
static bool is_always_within_branch_range(Address entry) {
+ if (AOTCodeCache::is_on_for_dump()) {
+ return false;
+ }
const address target = entry.target();
if (!CodeCache::contains(target)) {
@@ -1003,9 +1009,6 @@ void MacroAssembler::c2bool(Register x) {
address MacroAssembler::ic_call(address entry, jint method_index) {
RelocationHolder rh = virtual_call_Relocation::spec(pc(), method_index);
- // address const_ptr = long_constant((jlong)Universe::non_oop_word());
- // uintptr_t offset;
- // ldr_constant(rscratch2, const_ptr);
movptr(rscratch2, (intptr_t)Universe::non_oop_word());
return trampoline_call(Address(entry, rh));
}
@@ -1145,15 +1148,15 @@ void MacroAssembler::call_VM(Register oop_result,
}
-void MacroAssembler::get_vm_result(Register oop_result, Register java_thread) {
- ldr(oop_result, Address(java_thread, JavaThread::vm_result_offset()));
- str(zr, Address(java_thread, JavaThread::vm_result_offset()));
+void MacroAssembler::get_vm_result_oop(Register oop_result, Register java_thread) {
+ ldr(oop_result, Address(java_thread, JavaThread::vm_result_oop_offset()));
+ str(zr, Address(java_thread, JavaThread::vm_result_oop_offset()));
verify_oop_msg(oop_result, "broken oop in call_VM_base");
}
-void MacroAssembler::get_vm_result_2(Register metadata_result, Register java_thread) {
- ldr(metadata_result, Address(java_thread, JavaThread::vm_result_2_offset()));
- str(zr, Address(java_thread, JavaThread::vm_result_2_offset()));
+void MacroAssembler::get_vm_result_metadata(Register metadata_result, Register java_thread) {
+ ldr(metadata_result, Address(java_thread, JavaThread::vm_result_metadata_offset()));
+ str(zr, Address(java_thread, JavaThread::vm_result_metadata_offset()));
}
void MacroAssembler::align(int modulus) {
@@ -2041,7 +2044,7 @@ void MacroAssembler::clinit_barrier(Register klass, Register scratch, Label* L_f
// Fast path check: class is fully initialized
lea(scratch, Address(klass, InstanceKlass::init_state_offset()));
ldarb(scratch, scratch);
- subs(zr, scratch, InstanceKlass::fully_initialized);
+ cmp(scratch, InstanceKlass::fully_initialized);
br(Assembler::EQ, *L_fast_path);
// Fast path check: current thread is initializer thread
@@ -2157,7 +2160,7 @@ void MacroAssembler::call_VM_leaf_base(address entry_point,
stp(rscratch1, rmethod, Address(pre(sp, -2 * wordSize)));
- mov(rscratch1, entry_point);
+ mov(rscratch1, RuntimeAddress(entry_point));
blr(rscratch1);
if (retaddr)
bind(*retaddr);
@@ -3234,9 +3237,13 @@ void MacroAssembler::resolve_global_jobject(Register value, Register tmp1, Regis
}
void MacroAssembler::stop(const char* msg) {
- BLOCK_COMMENT(msg);
+ // Skip AOT caching C strings in scratch buffer.
+ const char* str = (code_section()->scratch_emit()) ? msg : AOTCodeCache::add_C_string(msg);
+ BLOCK_COMMENT(str);
+ // load msg into r0 so we can access it from the signal handler
+ // ExternalAddress enables saving and restoring via the code cache
+ lea(c_rarg0, ExternalAddress((address) str));
dcps1(0xdeae);
- emit_int64((uintptr_t)msg);
}
void MacroAssembler::unimplemented(const char* what) {
@@ -5520,9 +5527,8 @@ void MacroAssembler::movoop(Register dst, jobject obj) {
mov(dst, Address((address)obj, rspec));
} else {
address dummy = address(uintptr_t(pc()) & -wordSize); // A nearby aligned address
- ldr_constant(dst, Address(dummy, rspec));
+ ldr(dst, Address(dummy, rspec));
}
-
}
// Move a metadata address into a register.
@@ -7034,10 +7040,17 @@ void MacroAssembler::lightweight_lock(Register basic_lock, Register obj, Registe
ldr(mark, Address(obj, oopDesc::mark_offset_in_bytes()));
if (UseObjectMonitorTable) {
- // Clear cache in case fast locking succeeds.
+ // Clear cache in case fast locking succeeds or we need to take the slow-path.
str(zr, Address(basic_lock, BasicObjectLock::lock_offset() + in_ByteSize((BasicLock::object_monitor_cache_offset_in_bytes()))));
}
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(t1, obj);
+ ldrb(t1, Address(t1, Klass::misc_flags_offset()));
+ tst(t1, KlassFlags::_misc_is_value_based_class);
+ br(Assembler::NE, slow);
+ }
+
// Check if the lock-stack is full.
ldrw(top, Address(rthread, JavaThread::lock_stack_top_offset()));
cmpw(top, (unsigned)LockStack::end_offset());
diff --git a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp
index bd537af59e471..17ee72a00c0e0 100644
--- a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp
+++ b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 1997, 2024, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 1997, 2025, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2014, 2024, Red Hat Inc. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
@@ -27,6 +27,7 @@
#define CPU_AARCH64_MACROASSEMBLER_AARCH64_HPP
#include "asm/assembler.inline.hpp"
+#include "code/aotCodeCache.hpp"
#include "code/vmreg.hpp"
#include "metaprogramming/enableIf.hpp"
#include "oops/compressedOops.hpp"
@@ -823,8 +824,8 @@ class MacroAssembler: public Assembler {
Register arg_1, Register arg_2, Register arg_3,
bool check_exceptions = true);
- void get_vm_result (Register oop_result, Register thread);
- void get_vm_result_2(Register metadata_result, Register thread);
+ void get_vm_result_oop(Register oop_result, Register thread);
+ void get_vm_result_metadata(Register metadata_result, Register thread);
// These always tightly bind to MacroAssembler::call_VM_base
// bypassing the virtual implementation
@@ -1315,6 +1316,10 @@ class MacroAssembler: public Assembler {
// Check if branches to the non nmethod section require a far jump
static bool codestub_branch_needs_far_jump() {
+ if (AOTCodeCache::is_on_for_dump()) {
+ // To calculate far_codestub_branch_size correctly.
+ return true;
+ }
return CodeCache::max_distance_to_non_nmethod() > branch_range;
}
@@ -1472,16 +1477,6 @@ class MacroAssembler: public Assembler {
public:
- void ldr_constant(Register dest, const Address &const_addr) {
- if (NearCpool) {
- ldr(dest, const_addr);
- } else {
- uint64_t offset;
- adrp(dest, InternalAddress(const_addr.target()), offset);
- ldr(dest, Address(dest, offset));
- }
- }
-
address read_polling_page(Register r, relocInfo::relocType rtype);
void get_polling_page(Register dest, relocInfo::relocType rtype);
@@ -1611,11 +1606,15 @@ class MacroAssembler: public Assembler {
void aes_round(FloatRegister input, FloatRegister subkey);
// ChaCha20 functions support block
- void cc20_quarter_round(FloatRegister aVec, FloatRegister bVec,
- FloatRegister cVec, FloatRegister dVec, FloatRegister scratch,
- FloatRegister tbl);
- void cc20_shift_lane_org(FloatRegister bVec, FloatRegister cVec,
- FloatRegister dVec, bool colToDiag);
+ void cc20_qr_add4(FloatRegister (&addFirst)[4],
+ FloatRegister (&addSecond)[4]);
+ void cc20_qr_xor4(FloatRegister (&firstElem)[4],
+ FloatRegister (&secondElem)[4], FloatRegister (&result)[4]);
+ void cc20_qr_lrot4(FloatRegister (&sourceReg)[4],
+ FloatRegister (&destReg)[4], int bits, FloatRegister table);
+ void cc20_set_qr_registers(FloatRegister (&vectorSet)[4],
+ const FloatRegister (&stateVectors)[16], int idx1, int idx2,
+ int idx3, int idx4);
// Place an ISB after code may have been modified due to a safepoint.
void safepoint_isb();
diff --git a/src/hotspot/cpu/aarch64/macroAssembler_aarch64_chacha.cpp b/src/hotspot/cpu/aarch64/macroAssembler_aarch64_chacha.cpp
index 1f7bb8f46f64f..083e81af5d969 100644
--- a/src/hotspot/cpu/aarch64/macroAssembler_aarch64_chacha.cpp
+++ b/src/hotspot/cpu/aarch64/macroAssembler_aarch64_chacha.cpp
@@ -28,60 +28,119 @@
#include "runtime/stubRoutines.hpp"
/**
- * Perform the quarter round calculations on values contained within
- * four SIMD registers.
+ * Perform the vectorized add for a group of 4 quarter round operations.
+ * In the ChaCha20 quarter round, there are two add ops: a += b and c += d.
+ * Each parameter is a set of 4 registers representing the 4 registers
+ * for each addend in the add operation for each of the quarter rounds.
+ * (e.g. for "a" it would consist of v0/v1/v2/v3). The result of the add
+ * is placed into the vectors in the "addFirst" array.
*
- * @param aVec the SIMD register containing only the "a" values
- * @param bVec the SIMD register containing only the "b" values
- * @param cVec the SIMD register containing only the "c" values
- * @param dVec the SIMD register containing only the "d" values
- * @param scratch scratch SIMD register used for 12 and 7 bit left rotations
- * @param table the SIMD register used as a table for 8 bit left rotations
+ * @param addFirst array of SIMD registers representing the first addend.
+ * @param addSecond array of SIMD registers representing the second addend.
*/
-void MacroAssembler::cc20_quarter_round(FloatRegister aVec, FloatRegister bVec,
- FloatRegister cVec, FloatRegister dVec, FloatRegister scratch,
- FloatRegister table) {
+void MacroAssembler::cc20_qr_add4(FloatRegister (&addFirst)[4],
+ FloatRegister (&addSecond)[4]) {
+ for (int i = 0; i < 4; i++) {
+ addv(addFirst[i], T4S, addFirst[i], addSecond[i]);
+ }
+}
+
+
+/**
+ * Perform the vectorized XOR for a group of 4 quarter round operations.
+ * In the ChaCha20 quarter round, there are two XOR ops: d ^= a and b ^= c
+ * Each parameter is a set of 4 registers representing the 4 registers
+ * for each element in the xor operation for each of the quarter rounds.
+ * (e.g. for "a" it would consist of v0/v1/v2/v3)
+ * Note: because the b ^= c ops precede a non-byte-aligned left-rotation,
+ * there is a third parameter which can take a set of scratch registers
+ * for the result, which facilitates doing the subsequent operations for
+ * the left rotation.
+ *
+ * @param firstElem array of SIMD registers representing the first element.
+ * @param secondElem array of SIMD registers representing the second element.
+ * @param result array of SIMD registers representing the destination.
+ * May be the same as firstElem or secondElem, or a separate array.
+ */
+void MacroAssembler::cc20_qr_xor4(FloatRegister (&firstElem)[4],
+ FloatRegister (&secondElem)[4], FloatRegister (&result)[4]) {
+ for (int i = 0; i < 4; i++) {
+ eor(result[i], T16B, firstElem[i], secondElem[i]);
+ }
+}
+
+/**
+ * Perform the vectorized left-rotation on 32-bit lanes for a group of
+ * 4 quarter round operations.
+ * Each parameter is a set of 4 registers representing the 4 registers
+ * for each element in the source and destination for each of the quarter
+ * rounds (e.g. for "d" it would consist of v12/v13/v14/v15 on columns and
+ * v15/v12/v13/v14 on diagonal alignments).
+ *
+ * @param sourceReg array of SIMD registers representing the source
+ * @param destReg array of SIMD registers representing the destination
+ * @param bits the distance of the rotation in bits, must be 16/12/8/7 per
+ * the ChaCha20 specification.
+ */
+void MacroAssembler::cc20_qr_lrot4(FloatRegister (&sourceReg)[4],
+ FloatRegister (&destReg)[4], int bits, FloatRegister table) {
+ switch (bits) {
+ case 16: // reg <<<= 16, in-place swap of half-words
+ for (int i = 0; i < 4; i++) {
+ rev32(destReg[i], T8H, sourceReg[i]);
+ }
+ break;
- // a += b, d ^= a, d <<<= 16
- addv(aVec, T4S, aVec, bVec);
- eor(dVec, T16B, dVec, aVec);
- rev32(dVec, T8H, dVec);
+ case 7: // reg <<<= (12 || 7)
+ case 12: // r-shift src -> dest, l-shift src & ins to dest
+ for (int i = 0; i < 4; i++) {
+ ushr(destReg[i], T4S, sourceReg[i], 32 - bits);
+ }
- // c += d, b ^= c, b <<<= 12
- addv(cVec, T4S, cVec, dVec);
- eor(scratch, T16B, bVec, cVec);
- ushr(bVec, T4S, scratch, 20);
- sli(bVec, T4S, scratch, 12);
+ for (int i = 0; i < 4; i++) {
+ sli(destReg[i], T4S, sourceReg[i], bits);
+ }
+ break;
- // a += b, d ^= a, d <<<= 8
- addv(aVec, T4S, aVec, bVec);
- eor(dVec, T16B, dVec, aVec);
- tbl(dVec, T16B, dVec, 1, table);
+ case 8: // reg <<<= 8, simulate left rotation with table reorg
+ for (int i = 0; i < 4; i++) {
+ tbl(destReg[i], T16B, sourceReg[i], 1, table);
+ }
+ break;
- // c += d, b ^= c, b <<<= 7
- addv(cVec, T4S, cVec, dVec);
- eor(scratch, T16B, bVec, cVec);
- ushr(bVec, T4S, scratch, 25);
- sli(bVec, T4S, scratch, 7);
+ default:
+ // The caller shouldn't be sending bit rotation values outside
+ // of the 16/12/8/7 as defined in the specification.
+ ShouldNotReachHere();
+ }
}
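The non-byte-aligned cases are ordinary bitwise rotations spread across vector lanes; a scalar sketch (not part of the patch) of the same operation:

    #include <cstdint>
    // Hypothetical helper, not HotSpot code. For bits = 12 or 7 the vector code
    // pairs ushr (right shift by 32 - bits) with sli (shift left and insert by
    // bits), which together compute exactly this rotation per 32-bit lane.
    static inline uint32_t rotl32(uint32_t x, unsigned bits) {
      return (x >> (32u - bits)) | (x << bits);
    }
    // rotl32(x, 16) is what rev32 (T8H) achieves by swapping the two half-words
    // of each lane, and rotl32(x, 8) is what the tbl byte shuffle achieves with
    // the lrot8Tbl index vector.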
/**
- * Shift the b, c, and d vectors between columnar and diagonal representations.
- * Note that the "a" vector does not shift.
+ * Set the FloatRegisters for a 4-vector register set. These will be used
+ * during various quarter round transformations (adds, xors and left-rotations).
+ * This method itself does not result in the output of any assembly
+ * instructions. It just organizes the vectors so they can be in columnar or
+ * diagonal alignments.
*
- * @param bVec the SIMD register containing only the "b" values
- * @param cVec the SIMD register containing only the "c" values
- * @param dVec the SIMD register containing only the "d" values
- * @param colToDiag true if moving columnar to diagonal, false if
- * moving diagonal back to columnar.
+ * @param vectorSet a 4-vector array to be altered into a new alignment
+ * @param stateVectors the 16-vector array that represents the current
+ * working state. The indices of this array match up with the
+ * organization of the ChaCha20 state per RFC 7539 (e.g. stateVectors[12]
+ * would contain the vector that holds the 32-bit counter, etc.)
+ * @param idx1 the index of the stateVectors array to be assigned to the
+ * first vectorSet element.
+ * @param idx2 the index of the stateVectors array to be assigned to the
+ * second vectorSet element.
+ * @param idx3 the index of the stateVectors array to be assigned to the
+ * third vectorSet element.
+ * @param idx4 the index of the stateVectors array to be assigned to the
+ * fourth vectorSet element.
*/
-void MacroAssembler::cc20_shift_lane_org(FloatRegister bVec, FloatRegister cVec,
- FloatRegister dVec, bool colToDiag) {
- int bShift = colToDiag ? 4 : 12;
- int cShift = 8;
- int dShift = colToDiag ? 12 : 4;
-
- ext(bVec, T16B, bVec, bVec, bShift);
- ext(cVec, T16B, cVec, cVec, cShift);
- ext(dVec, T16B, dVec, dVec, dShift);
+void MacroAssembler::cc20_set_qr_registers(FloatRegister (&vectorSet)[4],
+ const FloatRegister (&stateVectors)[16], int idx1, int idx2,
+ int idx3, int idx4) {
+ vectorSet[0] = stateVectors[idx1];
+ vectorSet[1] = stateVectors[idx2];
+ vectorSet[2] = stateVectors[idx3];
+ vectorSet[3] = stateVectors[idx4];
}
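In practice the index sets passed to this helper follow RFC 7539: the column rounds use (0,4,8,12), (1,5,9,13), (2,6,10,14), (3,7,11,15) and the diagonal rounds use (0,5,10,15), (1,6,11,12), (2,7,8,13), (3,4,9,14). Switching the b/c/d sets between the two alignments amounts to rotating each index group left, as this standalone sketch (not part of the patch) shows:

    #include <array>
    // Hypothetical helper, not HotSpot code: rotate an index set left by k
    // positions, e.g. b {4,5,6,7} -> {5,6,7,4} (k = 1), c {8,9,10,11} ->
    // {10,11,8,9} (k = 2), d {12,13,14,15} -> {15,12,13,14} (k = 3).
    static std::array<int, 4> rotate_left(const std::array<int, 4>& s, int k) {
      std::array<int, 4> r{};
      for (int i = 0; i < 4; i++) {
        r[i] = s[(i + k) % 4];
      }
      return r;
    }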
diff --git a/src/hotspot/cpu/aarch64/matcher_aarch64.hpp b/src/hotspot/cpu/aarch64/matcher_aarch64.hpp
index a6cd055775870..0fbc2ef141e8b 100644
--- a/src/hotspot/cpu/aarch64/matcher_aarch64.hpp
+++ b/src/hotspot/cpu/aarch64/matcher_aarch64.hpp
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2021, 2024, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2021, 2025, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
@@ -200,4 +200,8 @@
return false;
}
+ // Is FEAT_FP16 supported for this CPU?
+ static bool is_feat_fp16_supported() {
+ return (VM_Version::supports_fphp() && VM_Version::supports_asimdhp());
+ }
#endif // CPU_AARCH64_MATCHER_AARCH64_HPP
diff --git a/src/hotspot/cpu/aarch64/methodHandles_aarch64.cpp b/src/hotspot/cpu/aarch64/methodHandles_aarch64.cpp
index 588b8898d2d2a..cdf67e3423f66 100644
--- a/src/hotspot/cpu/aarch64/methodHandles_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/methodHandles_aarch64.cpp
@@ -93,14 +93,60 @@ void MethodHandles::verify_klass(MacroAssembler* _masm,
void MethodHandles::verify_ref_kind(MacroAssembler* _masm, int ref_kind, Register member_reg, Register temp) { }
+void MethodHandles::verify_method(MacroAssembler* _masm, Register method, vmIntrinsics::ID iid) {
+ BLOCK_COMMENT("verify_method {");
+ __ verify_method_ptr(method);
+ if (VerifyMethodHandles) {
+ Label L_ok;
+ assert_different_registers(method, rscratch1, rscratch2);
+ const Register method_holder = rscratch1;
+ __ load_method_holder(method_holder, method);
+
+ switch (iid) {
+ case vmIntrinsicID::_invokeBasic:
+ // Require compiled LambdaForm class to be fully initialized.
+ __ lea(rscratch2, Address(method_holder, InstanceKlass::init_state_offset()));
+ __ ldarb(rscratch2, rscratch2);
+ __ cmp(rscratch2, InstanceKlass::fully_initialized);
+ __ br(Assembler::EQ, L_ok);
+ break;
+
+ case vmIntrinsicID::_linkToStatic:
+ __ clinit_barrier(method_holder, rscratch2, &L_ok);
+ break;
+
+ case vmIntrinsicID::_linkToVirtual:
+ case vmIntrinsicID::_linkToSpecial:
+ case vmIntrinsicID::_linkToInterface:
+ // Class initialization check is too strong here. Just ensure that class initialization has been initiated.
+ __ lea(rscratch2, Address(method_holder, InstanceKlass::init_state_offset()));
+ __ ldarb(rscratch2, rscratch2);
+ __ cmp(rscratch2, InstanceKlass::being_initialized);
+ __ br(Assembler::GE, L_ok);
+
+ // init_state check failed, but it may be an abstract interface method
+ __ ldrh(rscratch2, Address(method, Method::access_flags_offset()));
+ __ tbnz(rscratch2, exact_log2(JVM_ACC_ABSTRACT), L_ok);
+ break;
+
+ default:
+ fatal("unexpected intrinsic %d: %s", vmIntrinsics::as_int(iid), vmIntrinsics::name_at(iid));
+ }
+
+ // Method holder init state check failed for a concrete method.
+ __ stop("Method holder klass is not initialized");
+ __ bind(L_ok);
+ }
+ BLOCK_COMMENT("} verify_method");
+}
#endif //ASSERT
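The _linkToVirtual/_linkToSpecial/_linkToInterface branch accepts any state at or past being_initialized, which relies on the init-state constants being ordered. A standalone sketch of the comparison the ldarb/cmp/br(GE) sequence performs (not part of the patch; the enum values shown are an assumption modeled on InstanceKlass::ClassState):

    #include <cstdint>
    // Hypothetical mirror, not HotSpot code.
    enum class ClassState : uint8_t {
      allocated, loaded, linked, being_initialized, fully_initialized, initialization_error
    };
    static bool initialization_started(ClassState s) {
      return s >= ClassState::being_initialized;   // mirrors br(Assembler::GE, L_ok)
    }
    static bool is_fully_initialized(ClassState s) {
      return s == ClassState::fully_initialized;   // the stricter _invokeBasic check
    }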
void MethodHandles::jump_from_method_handle(MacroAssembler* _masm, Register method, Register temp,
- bool for_compiler_entry) {
+ bool for_compiler_entry, vmIntrinsics::ID iid) {
assert(method == rmethod, "interpreter calling convention");
Label L_no_such_method;
__ cbz(rmethod, L_no_such_method);
- __ verify_method_ptr(method);
+ verify_method(_masm, method, iid);
if (!for_compiler_entry && JvmtiExport::can_post_interpreter_events()) {
Label run_compiled_code;
@@ -160,7 +206,7 @@ void MethodHandles::jump_to_lambda_form(MacroAssembler* _masm,
__ BIND(L);
}
- jump_from_method_handle(_masm, method_temp, temp2, for_compiler_entry);
+ jump_from_method_handle(_masm, method_temp, temp2, for_compiler_entry, vmIntrinsics::_invokeBasic);
BLOCK_COMMENT("} jump_to_lambda_form");
}
@@ -447,8 +493,7 @@ void MethodHandles::generate_method_handle_dispatch(MacroAssembler* _masm,
// After figuring out which concrete method to call, jump into it.
// Note that this works in the interpreter with no data motion.
// But the compiled version will require that r2_recv be shifted out.
- __ verify_method_ptr(rmethod);
- jump_from_method_handle(_masm, rmethod, temp1, for_compiler_entry);
+ jump_from_method_handle(_masm, rmethod, temp1, for_compiler_entry, iid);
if (iid == vmIntrinsics::_linkToInterface) {
__ bind(L_incompatible_class_change_error);
__ far_jump(RuntimeAddress(SharedRuntime::throw_IncompatibleClassChangeError_entry()));
diff --git a/src/hotspot/cpu/aarch64/methodHandles_aarch64.hpp b/src/hotspot/cpu/aarch64/methodHandles_aarch64.hpp
index bd36f3e84c29a..e82f4d6237ea1 100644
--- a/src/hotspot/cpu/aarch64/methodHandles_aarch64.hpp
+++ b/src/hotspot/cpu/aarch64/methodHandles_aarch64.hpp
@@ -39,6 +39,8 @@ enum /* platform_dependent_constants */ {
Register obj, vmClassID klass_id,
const char* error_message = "wrong klass") NOT_DEBUG_RETURN;
+ static void verify_method(MacroAssembler* _masm, Register method, vmIntrinsics::ID iid) NOT_DEBUG_RETURN;
+
static void verify_method_handle(MacroAssembler* _masm, Register mh_reg) {
verify_klass(_masm, mh_reg, VM_CLASS_ID(java_lang_invoke_MethodHandle),
"reference is a MH");
@@ -49,7 +51,7 @@ enum /* platform_dependent_constants */ {
// Similar to InterpreterMacroAssembler::jump_from_interpreted.
// Takes care of special dispatch from single stepping too.
static void jump_from_method_handle(MacroAssembler* _masm, Register method, Register temp,
- bool for_compiler_entry);
+ bool for_compiler_entry, vmIntrinsics::ID iid);
static void jump_to_lambda_form(MacroAssembler* _masm,
Register recv, Register method_temp,
diff --git a/src/hotspot/cpu/aarch64/register_aarch64.cpp b/src/hotspot/cpu/aarch64/register_aarch64.cpp
index 349845154e2fe..82683daae4f08 100644
--- a/src/hotspot/cpu/aarch64/register_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/register_aarch64.cpp
@@ -58,23 +58,3 @@ const char* PRegister::PRegisterImpl::name() const {
};
return is_valid() ? names[encoding()] : "pnoreg";
}
-
-// convenience methods for splitting 8-way vector register sequences
-// in half -- needed because vector operations can normally only be
-// benefit from 4-way instruction parallelism
-
-VSeq<4> vs_front(const VSeq<8>& v) {
- return VSeq<4>(v.base(), v.delta());
-}
-
-VSeq<4> vs_back(const VSeq<8>& v) {
- return VSeq<4>(v.base() + 4 * v.delta(), v.delta());
-}
-
-VSeq<4> vs_even(const VSeq<8>& v) {
- return VSeq<4>(v.base(), v.delta() * 2);
-}
-
-VSeq<4> vs_odd(const VSeq<8>& v) {
- return VSeq<4>(v.base() + 1, v.delta() * 2);
-}
diff --git a/src/hotspot/cpu/aarch64/register_aarch64.hpp b/src/hotspot/cpu/aarch64/register_aarch64.hpp
index 45578336cfeaa..108f0f34140b4 100644
--- a/src/hotspot/cpu/aarch64/register_aarch64.hpp
+++ b/src/hotspot/cpu/aarch64/register_aarch64.hpp
@@ -436,19 +436,20 @@ enum RC { rc_bad, rc_int, rc_float, rc_predicate, rc_stack };
// inputs into front and back halves or odd and even halves (see
// convenience methods below).
+// helper macro for computing register masks
+#define VS_MASK_BIT(base, delta, i) (1 << (base + delta * i))
+
template<int N> class VSeq {
static_assert(N >= 2, "vector sequence length must be greater than 1");
- static_assert(N <= 8, "vector sequence length must not exceed 8");
- static_assert((N & (N - 1)) == 0, "vector sequence length must be power of two");
private:
int _base; // index of first register in sequence
int _delta; // increment to derive successive indices
public:
VSeq(FloatRegister base_reg, int delta = 1) : VSeq(base_reg->encoding(), delta) { }
VSeq(int base, int delta = 1) : _base(base), _delta(delta) {
- assert (_base >= 0, "invalid base register");
- assert (_delta >= 0, "invalid register delta");
- assert ((_base + (N - 1) * _delta) < 32, "range exceeded");
+ assert (_base >= 0 && _base <= 31, "invalid base register");
+ assert ((_base + (N - 1) * _delta) >= 0, "register range underflow");
+ assert ((_base + (N - 1) * _delta) < 32, "register range overflow");
}
// indexed access to sequence
FloatRegister operator [](int i) const {
@@ -457,27 +458,89 @@ template<int N> class VSeq {
}
int mask() const {
int m = 0;
- int bit = 1 << _base;
for (int i = 0; i < N; i++) {
- m |= bit << (i * _delta);
+ m |= VS_MASK_BIT(_base, _delta, i);
}
return m;
}
int base() const { return _base; }
int delta() const { return _delta; }
+ bool is_constant() const { return _delta == 0; }
};
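A worked example of the mask computation may help; this standalone sketch (not part of the patch) mirrors VS_MASK_BIT and mask():

    // Hypothetical helper, not HotSpot code: one bit per register index in the
    // sequence. vseq_mask(20, 1, 4) covers v20..v23, i.e. bits 20..23, while
    // vseq_mask(20, 2, 4) covers v20/v22/v24/v26.
    static int vseq_mask(int base, int delta, int n) {
      int m = 0;
      for (int i = 0; i < n; i++) {
        m |= 1 << (base + delta * i);
      }
      return m;
    }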
-// declare convenience methods for splitting vector register sequences
-
-VSeq<4> vs_front(const VSeq<8>& v);
-VSeq<4> vs_back(const VSeq<8>& v);
-VSeq<4> vs_even(const VSeq<8>& v);
-VSeq<4> vs_odd(const VSeq<8>& v);
-
-// methods for use in asserts to check VSeq inputs and oupts are
+// methods for use in asserts to check VSeq inputs and outputs are
// either disjoint or equal
template<int N> bool vs_disjoint(const VSeq<N>& n, const VSeq<N>& m) { return (n.mask() & m.mask()) == 0; }
template<int N> bool vs_same(const VSeq<N>& n, const VSeq<N>& m) { return n.mask() == m.mask(); }
+// method for use in asserts to check whether registers appearing in
+// an output sequence will be written before they are read from an
+// input sequence.
+
+template<int N> bool vs_write_before_read(const VSeq<N>& vout, const VSeq<N>& vin) {
+ int b_in = vin.base();
+ int d_in = vin.delta();
+ int b_out = vout.base();
+ int d_out = vout.delta();
+ int bit_in = 1 << b_in;
+ int bit_out = 1 << b_out;
+ int mask_read = vin.mask(); // all pending reads
+ int mask_write = 0; // no writes as yet
+
+
+ for (int i = 0; i < N - 1; i++) {
+ // check whether a pending read clashes with a write
+ if ((mask_write & mask_read) != 0) {
+ return true;
+ }
+ // remove the pending input (so long as this is a constant
+ // sequence)
+ if (d_in != 0) {
+ mask_read ^= VS_MASK_BIT(b_in, d_in, i);
+ }
+ // record the next write
+ mask_write |= VS_MASK_BIT(b_out, d_out, i);
+ }
+ // no write before read
+ return false;
+}
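The check walks the two sequences step by step; a standalone re-derivation (not part of the patch) with a worked pair of inputs:

    #include <cstdint>
    // Hypothetical mirror, not HotSpot code: element i of "out" is written in
    // step i and element i of "in" is read in step i; report a hazard if a
    // register is written while a later step still needs to read it.
    static bool write_before_read(int b_out, int d_out, int b_in, int d_in, int n) {
      uint32_t pending_reads = 0;
      uint32_t writes = 0;
      for (int i = 0; i < n; i++) {
        pending_reads |= 1u << (b_in + d_in * i);
      }
      for (int i = 0; i < n - 1; i++) {
        if ((writes & pending_reads) != 0) {
          return true;                                  // earlier write clobbers a pending read
        }
        if (d_in != 0) {
          pending_reads &= ~(1u << (b_in + d_in * i));  // read for step i is done
        }
        writes |= 1u << (b_out + d_out * i);            // write for step i is done
      }
      return false;
    }
    // write_before_read(2, 1, 1, 1, 4) is true: v2 is written in step 0 but still
    // read in step 1. write_before_read(1, 1, 2, 1, 4) is false: every register
    // that is both read and written is read before it is overwritten.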
+
+// convenience methods for splitting 8-way or 4-way vector register
+// sequences in half -- needed because vector operations can normally
+// benefit from 4-way instruction parallelism or, occasionally, 2-way
+// parallelism
+
+template<int N>
+VSeq<N/2> vs_front(const VSeq<N>& v) {
+  static_assert(N > 0 && ((N & 1) == 0), "sequence length must be even");
+  return VSeq<N/2>(v.base(), v.delta());
+}
+
+template<int N>
+VSeq<N/2> vs_back(const VSeq<N>& v) {
+  static_assert(N > 0 && ((N & 1) == 0), "sequence length must be even");
+  return VSeq<N/2>(v.base() + N / 2 * v.delta(), v.delta());
+}
+
+template<int N>
+VSeq<N/2> vs_even(const VSeq<N>& v) {
+  static_assert(N > 0 && ((N & 1) == 0), "sequence length must be even");
+  return VSeq<N/2>(v.base(), v.delta() * 2);
+}
+
+template<int N>
+VSeq<N/2> vs_odd(const VSeq<N>& v) {
+  static_assert(N > 0 && ((N & 1) == 0), "sequence length must be even");
+  return VSeq<N/2>(v.base() + v.delta(), v.delta() * 2);
+}
+
+// convenience method to construct a vector register sequence that
+// indexes its elements in reverse order to the original
+
+template<int N>
+VSeq<N> vs_reverse(const VSeq<N>& v) {
+  return VSeq<N>(v.base() + (N - 1) * v.delta(), -v.delta());
+}
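Usage sketch (the concrete registers are illustrative only): with the helpers above,

    VSeq<8> vq(v16);              // v16, v17, ..., v23
    VSeq<4> f = vs_front(vq);     // v16, v17, v18, v19
    VSeq<4> b = vs_back(vq);      // v20, v21, v22, v23
    VSeq<4> e = vs_even(vq);      // v16, v18, v20, v22
    VSeq<4> o = vs_odd(vq);       // v17, v19, v21, v23
    VSeq<8> r = vs_reverse(vq);   // v23, v22, ..., v16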
+
#endif // CPU_AARCH64_REGISTER_AARCH64_HPP
diff --git a/src/hotspot/cpu/aarch64/runtime_aarch64.cpp b/src/hotspot/cpu/aarch64/runtime_aarch64.cpp
index 83e43c3ebd235..2361d584f4252 100644
--- a/src/hotspot/cpu/aarch64/runtime_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/runtime_aarch64.cpp
@@ -65,6 +65,9 @@ UncommonTrapBlob* OptoRuntime::generate_uncommon_trap_blob() {
// Setup code generation tools
const char* name = OptoRuntime::stub_name(OptoStubId::uncommon_trap_id);
CodeBuffer buffer(name, 2048, 1024);
+ if (buffer.blob() == nullptr) {
+ return nullptr;
+ }
MacroAssembler* masm = new MacroAssembler(&buffer);
assert(SimpleRuntimeFrame::framesize % 4 == 0, "sp not 16-byte aligned");
@@ -285,6 +288,9 @@ ExceptionBlob* OptoRuntime::generate_exception_blob() {
// Setup code generation tools
const char* name = OptoRuntime::stub_name(OptoStubId::exception_id);
CodeBuffer buffer(name, 2048, 1024);
+ if (buffer.blob() == nullptr) {
+ return nullptr;
+ }
MacroAssembler* masm = new MacroAssembler(&buffer);
// TODO check various assumptions made here
diff --git a/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp b/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp
index b0b299876018a..0c3dfabc93e88 100644
--- a/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp
@@ -557,40 +557,6 @@ void SharedRuntime::gen_i2c_adapter(MacroAssembler *masm,
// If this happens, control eventually transfers back to the compiled
// caller, but with an uncorrected stack, causing delayed havoc.
- if (VerifyAdapterCalls &&
- (Interpreter::code() != nullptr || StubRoutines::final_stubs_code() != nullptr)) {
-#if 0
- // So, let's test for cascading c2i/i2c adapters right now.
- // assert(Interpreter::contains($return_addr) ||
- // StubRoutines::contains($return_addr),
- // "i2c adapter must return to an interpreter frame");
- __ block_comment("verify_i2c { ");
- Label L_ok;
- if (Interpreter::code() != nullptr) {
- range_check(masm, rax, r11,
- Interpreter::code()->code_start(), Interpreter::code()->code_end(),
- L_ok);
- }
- if (StubRoutines::initial_stubs_code() != nullptr) {
- range_check(masm, rax, r11,
- StubRoutines::initial_stubs_code()->code_begin(),
- StubRoutines::initial_stubs_code()->code_end(),
- L_ok);
- }
- if (StubRoutines::final_stubs_code() != nullptr) {
- range_check(masm, rax, r11,
- StubRoutines::final_stubs_code()->code_begin(),
- StubRoutines::final_stubs_code()->code_end(),
- L_ok);
- }
- const char* msg = "i2c adapter must return to an interpreter frame";
- __ block_comment(msg);
- __ stop(msg);
- __ bind(L_ok);
- __ block_comment("} verify_i2ce ");
-#endif
- }
-
// Cut-out for having no stack args.
int comp_words_on_stack = align_up(comp_args_on_stack*VMRegImpl::stack_slot_size, wordSize)>>LogBytesPerWord;
if (comp_args_on_stack) {
@@ -711,12 +677,12 @@ void SharedRuntime::gen_i2c_adapter(MacroAssembler *masm,
}
// ---------------------------------------------------------------
-AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm,
- int total_args_passed,
- int comp_args_on_stack,
- const BasicType *sig_bt,
- const VMRegPair *regs,
- AdapterFingerPrint* fingerprint) {
+void SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm,
+ int total_args_passed,
+ int comp_args_on_stack,
+ const BasicType *sig_bt,
+ const VMRegPair *regs,
+ AdapterHandlerEntry* handler) {
address i2c_entry = __ pc();
gen_i2c_adapter(masm, total_args_passed, comp_args_on_stack, sig_bt, regs);
@@ -777,7 +743,8 @@ AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm
gen_c2i_adapter(masm, total_args_passed, comp_args_on_stack, sig_bt, regs, skip_fixup);
- return AdapterHandlerLibrary::new_entry(fingerprint, i2c_entry, c2i_entry, c2i_unverified_entry, c2i_no_clinit_check_entry);
+ handler->set_entry_points(i2c_entry, c2i_entry, c2i_unverified_entry, c2i_no_clinit_check_entry);
+ return;
}
static int c_calling_convention_priv(const BasicType *sig_bt,
@@ -2783,7 +2750,7 @@ RuntimeStub* SharedRuntime::generate_resolve_blob(SharedStubId id, address desti
__ cbnz(rscratch1, pending);
// get the returned Method*
- __ get_vm_result_2(rmethod, rthread);
+ __ get_vm_result_metadata(rmethod, rthread);
__ str(rmethod, Address(sp, reg_save.reg_offset_in_bytes(rmethod)));
// r0 is where we want to jump, overwrite rscratch1 which is saved and scratch
@@ -2802,7 +2769,7 @@ RuntimeStub* SharedRuntime::generate_resolve_blob(SharedStubId id, address desti
// exception pending => remove activation and forward to exception handler
- __ str(zr, Address(rthread, JavaThread::vm_result_offset()));
+ __ str(zr, Address(rthread, JavaThread::vm_result_oop_offset()));
__ ldr(r0, Address(rthread, Thread::pending_exception_offset()));
__ far_jump(RuntimeAddress(StubRoutines::forward_exception_entry()));
diff --git a/src/hotspot/cpu/aarch64/stubDeclarations_aarch64.hpp b/src/hotspot/cpu/aarch64/stubDeclarations_aarch64.hpp
index a893aacaaf2dd..425146e6bf46d 100644
--- a/src/hotspot/cpu/aarch64/stubDeclarations_aarch64.hpp
+++ b/src/hotspot/cpu/aarch64/stubDeclarations_aarch64.hpp
@@ -44,7 +44,7 @@
do_arch_blob, \
do_arch_entry, \
do_arch_entry_init) \
- do_arch_blob(compiler, 55000 ZGC_ONLY(+5000)) \
+ do_arch_blob(compiler, 70000) \
do_stub(compiler, vector_iota_indices) \
do_arch_entry(aarch64, compiler, vector_iota_indices, \
vector_iota_indices, vector_iota_indices) \
diff --git a/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp b/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp
index f0f145e3d7612..f5567dcc03ac5 100644
--- a/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp
@@ -4405,89 +4405,44 @@ class StubGenerator: public StubCodeGenerator {
return start;
}
- /**
- * Arguments:
- *
- * Inputs:
- * c_rarg0 - int crc
- * c_rarg1 - byte* buf
- * c_rarg2 - int length
- *
- * Output:
- * rax - int crc result
- */
- address generate_updateBytesCRC32() {
- assert(UseCRC32Intrinsics, "what are we doing here?");
-
- __ align(CodeEntryAlignment);
- StubGenStubId stub_id = StubGenStubId::updateBytesCRC32_id;
- StubCodeMark mark(this, stub_id);
-
- address start = __ pc();
-
- const Register crc = c_rarg0; // crc
- const Register buf = c_rarg1; // source java byte array address
- const Register len = c_rarg2; // length
- const Register table0 = c_rarg3; // crc_table address
- const Register table1 = c_rarg4;
- const Register table2 = c_rarg5;
- const Register table3 = c_rarg6;
- const Register tmp3 = c_rarg7;
-
- BLOCK_COMMENT("Entry:");
- __ enter(); // required for proper stackwalking of RuntimeStub frame
-
- __ kernel_crc32(crc, buf, len,
- table0, table1, table2, table3, rscratch1, rscratch2, tmp3);
-
- __ leave(); // required for proper stackwalking of RuntimeStub frame
- __ ret(lr);
-
- return start;
- }
-
- // ChaCha20 block function. This version parallelizes 4 quarter
- // round operations at a time. It uses 16 SIMD registers to
- // produce 4 blocks of key stream.
+ // ChaCha20 block function. This version parallelizes the 32-bit
+ // state elements on each of 16 vectors, producing 4 blocks of
+ // keystream at a time.
//
// state (int[16]) = c_rarg0
// keystream (byte[256]) = c_rarg1
- // return - number of bytes of keystream (always 256)
+ // return - number of bytes of produced keystream (always 256)
//
- // In this approach, we load the 512-bit start state sequentially into
- // 4 128-bit vectors. We then make 4 4-vector copies of that starting
- // state, with each successive set of 4 vectors having a +1 added into
- // the first 32-bit lane of the 4th vector in that group (the counter).
- // By doing this, we can perform the block function on 4 512-bit blocks
- // within one run of this intrinsic.
- // The alignment of the data across the 4-vector group is such that at
- // the start it is already aligned for the first round of each two-round
- // loop iteration. In other words, the corresponding lanes of each vector
- // will contain the values needed for that quarter round operation (e.g.
- // elements 0/4/8/12, 1/5/9/13, 2/6/10/14, etc.).
- // In between each full round, a lane shift must occur. Within a loop
- // iteration, between the first and second rounds, the 2nd, 3rd, and 4th
- // vectors are rotated left 32, 64 and 96 bits, respectively. The result
- // is effectively a diagonal orientation in columnar form. After the
- // second full round, those registers are left-rotated again, this time
- // 96, 64, and 32 bits - returning the vectors to their columnar organization.
- // After all 10 iterations, the original state is added to each 4-vector
- // working state along with the add mask, and the 4 vector groups are
- // sequentially written to the memory dedicated for the output key stream.
- //
- // For a more detailed explanation, see Goll and Gueron, "Vectorization of
- // ChaCha Stream Cipher", 2014 11th Int. Conf. on Information Technology:
- // New Generations, Las Vegas, NV, USA, April 2014, DOI: 10.1109/ITNG.2014.33
- address generate_chacha20Block_qrpar() {
- Label L_Q_twoRounds, L_Q_cc20_const;
+ // This implementation takes each 32-bit integer from the state
+ // array and broadcasts it across all 4 32-bit lanes of a vector register
+ // (e.g. state[0] is replicated on all 4 lanes of v4, state[1] to all 4 lanes
+ // of v5, etc.). Once all 16 elements have been broadcast onto 16 vectors,
+ // the quarter round schedule is implemented as outlined in RFC 7539 section
+ // 2.3. However, instead of sequentially processing the 3 quarter round
+ // operations represented by one QUARTERROUND function, we instead stack all
+ // the adds, xors and left-rotations from the first 4 quarter rounds together
+ // and then do the same for the second set of 4 quarter rounds. This removes
+ // some latency that would otherwise be incurred by waiting for an add to
+ // complete before performing an xor (which depends on the result of the
+ // add), etc. An adjustment happens between the first and second groups of 4
+ // quarter rounds, but this is done only in the inputs to the macro functions
+ // that generate the assembly instructions - these adjustments themselves are
+ // not part of the resulting assembly.
+ // The 4 registers v0-v3 are used during the quarter round operations as
+ // scratch registers. Once the 20 rounds are complete, these 4 scratch
+ // registers become the vectors involved in adding the start state back onto
+ // the post-QR working state. After the adds are complete, each of the 16
+ // vectors writes its first lane back to the keystream buffer, followed
+ // by the second lane from all vectors and so on.
+ address generate_chacha20Block_blockpar() {
+ Label L_twoRounds, L_cc20_const;
// The constant data is broken into two 128-bit segments to be loaded
- // onto SIMD registers. The first 128 bits are a counter add overlay
- // that adds +1/+0/+0/+0 to the vectors holding replicated state[12].
+ // onto FloatRegisters. The first 128 bits are a counter add overlay
+ // that adds +0/+1/+2/+3 to the vector holding replicated state[12].
// The second 128-bits is a table constant used for 8-bit left rotations.
- // on 32-bit lanes within a SIMD register.
- __ BIND(L_Q_cc20_const);
- __ emit_int64(0x0000000000000001UL);
- __ emit_int64(0x0000000000000000UL);
+ __ BIND(L_cc20_const);
+ __ emit_int64(0x0000000100000000UL);
+ __ emit_int64(0x0000000300000002UL);
__ emit_int64(0x0605040702010003UL);
__ emit_int64(0x0E0D0C0F0A09080BUL);
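The first constant is the per-lane counter overlay; a scalar sketch (not part of the patch) of what it accomplishes:

    #include <cstdint>
    // Hypothetical helper, not HotSpot code: lane i of the replicated state[12]
    // vector becomes the block counter of keystream block i, so one pass over
    // the 20 rounds yields 4 consecutive 64-byte blocks.
    static void apply_counter_overlay(uint32_t state12, uint32_t lanes[4]) {
      for (int i = 0; i < 4; i++) {
        lanes[i] = state12 + i;   // mirrors the later addv(workSt[12], T4S, workSt[12], ctrAddOverlay)
      }
    }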
@@ -4497,144 +4452,142 @@ class StubGenerator: public StubCodeGenerator {
address start = __ pc();
__ enter();
+ int i, j;
const Register state = c_rarg0;
const Register keystream = c_rarg1;
const Register loopCtr = r10;
const Register tmpAddr = r11;
+ const FloatRegister ctrAddOverlay = v28;
+ const FloatRegister lrot8Tbl = v29;
+
+ // Organize SIMD registers in an array that facilitates
+ // putting repetitive opcodes into loop structures. It is
+ // important that each grouping of 4 registers is monotonically
+ // increasing to support the requirements of multi-register
+ // instructions (e.g. ld4r, st4, etc.)
+ const FloatRegister workSt[16] = {
+ v4, v5, v6, v7, v16, v17, v18, v19,
+ v20, v21, v22, v23, v24, v25, v26, v27
+ };
- const FloatRegister aState = v0;
- const FloatRegister bState = v1;
- const FloatRegister cState = v2;
- const FloatRegister dState = v3;
- const FloatRegister a1Vec = v4;
- const FloatRegister b1Vec = v5;
- const FloatRegister c1Vec = v6;
- const FloatRegister d1Vec = v7;
- // Skip the callee-saved registers v8 - v15
- const FloatRegister a2Vec = v16;
- const FloatRegister b2Vec = v17;
- const FloatRegister c2Vec = v18;
- const FloatRegister d2Vec = v19;
- const FloatRegister a3Vec = v20;
- const FloatRegister b3Vec = v21;
- const FloatRegister c3Vec = v22;
- const FloatRegister d3Vec = v23;
- const FloatRegister a4Vec = v24;
- const FloatRegister b4Vec = v25;
- const FloatRegister c4Vec = v26;
- const FloatRegister d4Vec = v27;
- const FloatRegister scratch = v28;
- const FloatRegister addMask = v29;
- const FloatRegister lrot8Tbl = v30;
-
- // Load the initial state in the first 4 quadword registers,
- // then copy the initial state into the next 4 quadword registers
- // that will be used for the working state.
- __ ld1(aState, bState, cState, dState, __ T16B, Address(state));
-
- // Load the index register for 2 constant 128-bit data fields.
- // The first represents the +1/+0/+0/+0 add mask. The second is
- // the 8-bit left rotation.
- __ adr(tmpAddr, L_Q_cc20_const);
- __ ldpq(addMask, lrot8Tbl, Address(tmpAddr));
-
- __ mov(a1Vec, __ T16B, aState);
- __ mov(b1Vec, __ T16B, bState);
- __ mov(c1Vec, __ T16B, cState);
- __ mov(d1Vec, __ T16B, dState);
-
- __ mov(a2Vec, __ T16B, aState);
- __ mov(b2Vec, __ T16B, bState);
- __ mov(c2Vec, __ T16B, cState);
- __ addv(d2Vec, __ T4S, d1Vec, addMask);
-
- __ mov(a3Vec, __ T16B, aState);
- __ mov(b3Vec, __ T16B, bState);
- __ mov(c3Vec, __ T16B, cState);
- __ addv(d3Vec, __ T4S, d2Vec, addMask);
-
- __ mov(a4Vec, __ T16B, aState);
- __ mov(b4Vec, __ T16B, bState);
- __ mov(c4Vec, __ T16B, cState);
- __ addv(d4Vec, __ T4S, d3Vec, addMask);
-
- // Set up the 10 iteration loop
+ // Pull in constant data. The first 16 bytes are the add overlay
+ // which is applied to the vector holding the counter (state[12]).
+ // The second 16 bytes form the index used by the tbl instruction
+ // for 8-bit left rotations.
+ __ adr(tmpAddr, L_cc20_const);
+ __ ldpq(ctrAddOverlay, lrot8Tbl, Address(tmpAddr));
+
+ // Load from memory and interlace across 16 SIMD registers,
+ // with each word from memory being broadcast to all lanes of
+ // each successive SIMD register.
+ // Addr(0) -> All lanes in workSt[i]
+ // Addr(4) -> All lanes in workSt[i + 1], etc.
+ __ mov(tmpAddr, state);
+ for (i = 0; i < 16; i += 4) {
+ __ ld4r(workSt[i], workSt[i + 1], workSt[i + 2], workSt[i + 3], __ T4S,
+ __ post(tmpAddr, 16));
+ }
+ __ addv(workSt[12], __ T4S, workSt[12], ctrAddOverlay); // Add ctr overlay
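In scalar terms the ld4r loop and counter overlay above amount to the following broadcast (a sketch with a hypothetical state array; each 32-bit lane then holds an independent block whose counter differs by the lane index):

    // Conceptual model: lane l of the working state computes keystream block l.
    uint32_t workSt[16][4];
    for (int w = 0; w < 16; w++) {
      for (int lane = 0; lane < 4; lane++) {
        workSt[w][lane] = state[w];      // ld4r: replicate word w across all 4 lanes
      }
    }
    for (int lane = 0; lane < 4; lane++) {
      workSt[12][lane] += lane;          // ctrAddOverlay: +0/+1/+2/+3
    }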
+
+ // Before entering the loop, create 5 4-register arrays. These
+ // will hold the 4 registers that represent the a/b/c/d fields
+ // in the quarter round operation. For instance the "b" field
+ // for the first 4 quarter round operations is the set of v16/v17/v18/v19,
+ // but in the second 4 quarter rounds it gets adjusted to v17/v18/v19/v16
+ // since it is part of a diagonal organization. The aSet and scratch
+ // register sets are defined at declaration time because they do not change
+ // organization at any point during the 20-round processing.
+ FloatRegister aSet[4] = { v4, v5, v6, v7 };
+ FloatRegister bSet[4];
+ FloatRegister cSet[4];
+ FloatRegister dSet[4];
+ FloatRegister scratch[4] = { v0, v1, v2, v3 };
+
+ // Set up the 10 iteration loop and perform all 8 quarter round ops
__ mov(loopCtr, 10);
- __ BIND(L_Q_twoRounds);
-
- // The first set of operations on the vectors covers the first 4 quarter
- // round operations:
- // Qround(state, 0, 4, 8,12)
- // Qround(state, 1, 5, 9,13)
- // Qround(state, 2, 6,10,14)
- // Qround(state, 3, 7,11,15)
- __ cc20_quarter_round(a1Vec, b1Vec, c1Vec, d1Vec, scratch, lrot8Tbl);
- __ cc20_quarter_round(a2Vec, b2Vec, c2Vec, d2Vec, scratch, lrot8Tbl);
- __ cc20_quarter_round(a3Vec, b3Vec, c3Vec, d3Vec, scratch, lrot8Tbl);
- __ cc20_quarter_round(a4Vec, b4Vec, c4Vec, d4Vec, scratch, lrot8Tbl);
-
- // Shuffle the b1Vec/c1Vec/d1Vec to reorganize the state vectors to
- // diagonals. The a1Vec does not need to change orientation.
- __ cc20_shift_lane_org(b1Vec, c1Vec, d1Vec, true);
- __ cc20_shift_lane_org(b2Vec, c2Vec, d2Vec, true);
- __ cc20_shift_lane_org(b3Vec, c3Vec, d3Vec, true);
- __ cc20_shift_lane_org(b4Vec, c4Vec, d4Vec, true);
-
- // The second set of operations on the vectors covers the second 4 quarter
- // round operations, now acting on the diagonals:
- // Qround(state, 0, 5,10,15)
- // Qround(state, 1, 6,11,12)
- // Qround(state, 2, 7, 8,13)
- // Qround(state, 3, 4, 9,14)
- __ cc20_quarter_round(a1Vec, b1Vec, c1Vec, d1Vec, scratch, lrot8Tbl);
- __ cc20_quarter_round(a2Vec, b2Vec, c2Vec, d2Vec, scratch, lrot8Tbl);
- __ cc20_quarter_round(a3Vec, b3Vec, c3Vec, d3Vec, scratch, lrot8Tbl);
- __ cc20_quarter_round(a4Vec, b4Vec, c4Vec, d4Vec, scratch, lrot8Tbl);
-
- // Before we start the next iteration, we need to perform shuffles
- // on the b/c/d vectors to move them back to columnar organizations
- // from their current diagonal orientation.
- __ cc20_shift_lane_org(b1Vec, c1Vec, d1Vec, false);
- __ cc20_shift_lane_org(b2Vec, c2Vec, d2Vec, false);
- __ cc20_shift_lane_org(b3Vec, c3Vec, d3Vec, false);
- __ cc20_shift_lane_org(b4Vec, c4Vec, d4Vec, false);
+ __ BIND(L_twoRounds);
+
+ // Set to columnar organization and do the following 4 quarter-rounds:
+ // QUARTERROUND(0, 4, 8, 12)
+ // QUARTERROUND(1, 5, 9, 13)
+ // QUARTERROUND(2, 6, 10, 14)
+ // QUARTERROUND(3, 7, 11, 15)
+ __ cc20_set_qr_registers(bSet, workSt, 4, 5, 6, 7);
+ __ cc20_set_qr_registers(cSet, workSt, 8, 9, 10, 11);
+ __ cc20_set_qr_registers(dSet, workSt, 12, 13, 14, 15);
+
+ __ cc20_qr_add4(aSet, bSet); // a += b
+ __ cc20_qr_xor4(dSet, aSet, dSet); // d ^= a
+ __ cc20_qr_lrot4(dSet, dSet, 16, lrot8Tbl); // d <<<= 16
+
+ __ cc20_qr_add4(cSet, dSet); // c += d
+ __ cc20_qr_xor4(bSet, cSet, scratch); // b ^= c (scratch)
+ __ cc20_qr_lrot4(scratch, bSet, 12, lrot8Tbl); // b <<<= 12
+
+ __ cc20_qr_add4(aSet, bSet); // a += b
+ __ cc20_qr_xor4(dSet, aSet, dSet); // d ^= a
+ __ cc20_qr_lrot4(dSet, dSet, 8, lrot8Tbl); // d <<<= 8
+
+ __ cc20_qr_add4(cSet, dSet); // c += d
+ __ cc20_qr_xor4(bSet, cSet, scratch); // b ^= c (scratch)
+ __ cc20_qr_lrot4(scratch, bSet, 7, lrot8Tbl); // b <<<= 7
+
+ // Set to diagonal organization and do the next 4 quarter-rounds:
+ // QUARTERROUND(0, 5, 10, 15)
+ // QUARTERROUND(1, 6, 11, 12)
+ // QUARTERROUND(2, 7, 8, 13)
+ // QUARTERROUND(3, 4, 9, 14)
+ __ cc20_set_qr_registers(bSet, workSt, 5, 6, 7, 4);
+ __ cc20_set_qr_registers(cSet, workSt, 10, 11, 8, 9);
+ __ cc20_set_qr_registers(dSet, workSt, 15, 12, 13, 14);
+
+ __ cc20_qr_add4(aSet, bSet); // a += b
+ __ cc20_qr_xor4(dSet, aSet, dSet); // d ^= a
+ __ cc20_qr_lrot4(dSet, dSet, 16, lrot8Tbl); // d <<<= 16
+
+ __ cc20_qr_add4(cSet, dSet); // c += d
+ __ cc20_qr_xor4(bSet, cSet, scratch); // b ^= c (scratch)
+ __ cc20_qr_lrot4(scratch, bSet, 12, lrot8Tbl); // b <<<= 12
+
+ __ cc20_qr_add4(aSet, bSet); // a += b
+ __ cc20_qr_xor4(dSet, aSet, dSet); // d ^= a
+ __ cc20_qr_lrot4(dSet, dSet, 8, lrot8Tbl); // d <<<= 8
+
+ __ cc20_qr_add4(cSet, dSet); // c += d
+ __ cc20_qr_xor4(bSet, cSet, scratch); // b ^= c (scratch)
+ __ cc20_qr_lrot4(scratch, bSet, 7, lrot8Tbl); // b <<<= 7
// Decrement and iterate
__ sub(loopCtr, loopCtr, 1);
- __ cbnz(loopCtr, L_Q_twoRounds);
-
- // Once the counter reaches zero, we fall out of the loop
- // and need to add the initial state back into the working state
- // represented by the a/b/c/d1Vec registers. This is destructive
- // on the dState register but we no longer will need it.
- __ addv(a1Vec, __ T4S, a1Vec, aState);
- __ addv(b1Vec, __ T4S, b1Vec, bState);
- __ addv(c1Vec, __ T4S, c1Vec, cState);
- __ addv(d1Vec, __ T4S, d1Vec, dState);
-
- __ addv(a2Vec, __ T4S, a2Vec, aState);
- __ addv(b2Vec, __ T4S, b2Vec, bState);
- __ addv(c2Vec, __ T4S, c2Vec, cState);
- __ addv(dState, __ T4S, dState, addMask);
- __ addv(d2Vec, __ T4S, d2Vec, dState);
-
- __ addv(a3Vec, __ T4S, a3Vec, aState);
- __ addv(b3Vec, __ T4S, b3Vec, bState);
- __ addv(c3Vec, __ T4S, c3Vec, cState);
- __ addv(dState, __ T4S, dState, addMask);
- __ addv(d3Vec, __ T4S, d3Vec, dState);
-
- __ addv(a4Vec, __ T4S, a4Vec, aState);
- __ addv(b4Vec, __ T4S, b4Vec, bState);
- __ addv(c4Vec, __ T4S, c4Vec, cState);
- __ addv(dState, __ T4S, dState, addMask);
- __ addv(d4Vec, __ T4S, d4Vec, dState);
-
- // Write the final state back to the result buffer
- __ st1(a1Vec, b1Vec, c1Vec, d1Vec, __ T16B, __ post(keystream, 64));
- __ st1(a2Vec, b2Vec, c2Vec, d2Vec, __ T16B, __ post(keystream, 64));
- __ st1(a3Vec, b3Vec, c3Vec, d3Vec, __ T16B, __ post(keystream, 64));
- __ st1(a4Vec, b4Vec, c4Vec, d4Vec, __ T16B, __ post(keystream, 64));
+ __ cbnz(loopCtr, L_twoRounds);
+
+ __ mov(tmpAddr, state);
+
+ // Add the starting state back to the post-loop keystream
+ // state. We read/interlace the state array from memory into
+ // 4 registers similar to what we did in the beginning. Then
+ // add the counter overlay onto workSt[12] at the end.
+ for (i = 0; i < 16; i += 4) {
+ __ ld4r(v0, v1, v2, v3, __ T4S, __ post(tmpAddr, 16));
+ __ addv(workSt[i], __ T4S, workSt[i], v0);
+ __ addv(workSt[i + 1], __ T4S, workSt[i + 1], v1);
+ __ addv(workSt[i + 2], __ T4S, workSt[i + 2], v2);
+ __ addv(workSt[i + 3], __ T4S, workSt[i + 3], v3);
+ }
+ __ addv(workSt[12], __ T4S, workSt[12], ctrAddOverlay); // Add ctr overlay
+
+ // Write working state into the keystream buffer. This is accomplished
+ // by taking the lane "i" from each of the four vectors and writing
+ // it to consecutive 4-byte offsets, then post-incrementing by 16 and
+ // repeating with the next 4 vectors until all 16 vectors have been used.
+ // Then move to the next lane and repeat the process until all lanes have
+ // been written.
+ for (i = 0; i < 4; i++) {
+ for (j = 0; j < 16; j += 4) {
+ __ st4(workSt[j], workSt[j + 1], workSt[j + 2], workSt[j + 3], __ S, i,
+ __ post(keystream, 16));
+ }
+ }
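The st4 nest above interleaves lanes so that each lane's 16 words land contiguously; a scalar model of the resulting layout (out being a hypothetical uint32_t view of the 256-byte keystream buffer):

    for (int lane = 0; lane < 4; lane++) {     // one 64-byte keystream block per lane
      for (int w = 0; w < 16; w++) {
        out[lane * 16 + w] = workSt[w][lane];
      }
    }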
__ mov(r0, 256); // Return length of output keystream
__ leave();
@@ -4651,6 +4604,11 @@ class StubGenerator: public StubCodeGenerator {
template
void vs_addv(const VSeq& v, Assembler::SIMD_Arrangement T,
const VSeq& v1, const VSeq& v2) {
+ // output must not be constant
+ assert(N == 1 || !v.is_constant(), "cannot output multiple values to a constant vector");
+ // output cannot overwrite pending inputs
+ assert(!vs_write_before_read(v, v1), "output overwrites input");
+ assert(!vs_write_before_read(v, v2), "output overwrites input");
for (int i = 0; i < N; i++) {
__ addv(v[i], T, v1[i], v2[i]);
}
@@ -4659,6 +4617,11 @@ class StubGenerator: public StubCodeGenerator {
template
void vs_subv(const VSeq& v, Assembler::SIMD_Arrangement T,
const VSeq& v1, const VSeq& v2) {
+ // output must not be constant
+ assert(N == 1 || !v.is_constant(), "cannot output multiple values to a constant vector");
+ // output cannot overwrite pending inputs
+ assert(!vs_write_before_read(v, v1), "output overwrites input");
+ assert(!vs_write_before_read(v, v2), "output overwrites input");
for (int i = 0; i < N; i++) {
__ subv(v[i], T, v1[i], v2[i]);
}
@@ -4667,6 +4630,11 @@ class StubGenerator: public StubCodeGenerator {
template
void vs_mulv(const VSeq& v, Assembler::SIMD_Arrangement T,
const VSeq& v1, const VSeq& v2) {
+ // output must not be constant
+ assert(N == 1 || !v.is_constant(), "cannot output multiple values to a constant vector");
+ // output cannot overwrite pending inputs
+ assert(!vs_write_before_read(v, v1), "output overwrites input");
+ assert(!vs_write_before_read(v, v2), "output overwrites input");
for (int i = 0; i < N; i++) {
__ mulv(v[i], T, v1[i], v2[i]);
}
@@ -4674,6 +4642,10 @@ class StubGenerator: public StubCodeGenerator {
template
void vs_negr(const VSeq& v, Assembler::SIMD_Arrangement T, const VSeq& v1) {
+ // output must not be constant
+ assert(N == 1 || !v.is_constant(), "cannot output multiple values to a constant vector");
+ // output cannot overwrite pending inputs
+ assert(!vs_write_before_read(v, v1), "output overwrites input");
for (int i = 0; i < N; i++) {
__ negr(v[i], T, v1[i]);
}
@@ -4682,6 +4654,10 @@ class StubGenerator: public StubCodeGenerator {
template
void vs_sshr(const VSeq& v, Assembler::SIMD_Arrangement T,
const VSeq& v1, int shift) {
+ // output must not be constant
+ assert(N == 1 || !v.is_constant(), "cannot output multiple values to a constant vector");
+ // output cannot overwrite pending inputs
+ assert(!vs_write_before_read(v, v1), "output overwrites input");
for (int i = 0; i < N; i++) {
__ sshr(v[i], T, v1[i], shift);
}
@@ -4689,6 +4665,11 @@ class StubGenerator: public StubCodeGenerator {
template
void vs_andr(const VSeq& v, const VSeq& v1, const VSeq& v2) {
+ // output must not be constant
+ assert(N == 1 || !v.is_constant(), "cannot output multiple values to a constant vector");
+ // output cannot overwrite pending inputs
+ assert(!vs_write_before_read(v, v1), "output overwrites input");
+ assert(!vs_write_before_read(v, v2), "output overwrites input");
for (int i = 0; i < N; i++) {
__ andr(v[i], __ T16B, v1[i], v2[i]);
}
@@ -4696,18 +4677,51 @@ class StubGenerator: public StubCodeGenerator {
template
void vs_orr(const VSeq& v, const VSeq& v1, const VSeq& v2) {
+ // output must not be constant
+ assert(N == 1 || !v.is_constant(), "cannot output multiple values to a constant vector");
+ // output cannot overwrite pending inputs
+ assert(!vs_write_before_read(v, v1), "output overwrites input");
+ assert(!vs_write_before_read(v, v2), "output overwrites input");
for (int i = 0; i < N; i++) {
__ orr(v[i], __ T16B, v1[i], v2[i]);
}
}
template
- void vs_notr(const VSeq& v, const VSeq& v1) {
+ void vs_notr(const VSeq& v, const VSeq& v1) {
+ // output must not be constant
+ assert(N == 1 || !v.is_constant(), "cannot output multiple values to a constant vector");
+ // output cannot overwrite pending inputs
+ assert(!vs_write_before_read(v, v1), "output overwrites input");
for (int i = 0; i < N; i++) {
__ notr(v[i], __ T16B, v1[i]);
}
}
+ template
+ void vs_sqdmulh(const VSeq& v, Assembler::SIMD_Arrangement T, const VSeq& v1, const VSeq& v2) {
+ // output must not be constant
+ assert(N == 1 || !v.is_constant(), "cannot output multiple values to a constant vector");
+ // output cannot overwrite pending inputs
+ assert(!vs_write_before_read(v, v1), "output overwrites input");
+ assert(!vs_write_before_read(v, v2), "output overwrites input");
+ for (int i = 0; i < N; i++) {
+ __ sqdmulh(v[i], T, v1[i], v2[i]);
+ }
+ }
+
+ template
+ void vs_mlsv(const VSeq& v, Assembler::SIMD_Arrangement T, const VSeq& v1, VSeq& v2) {
+ // output must not be constant
+ assert(N == 1 || !v.is_constant(), "cannot output multiple values to a constant vector");
+ // output cannot overwrite pending inputs
+ assert(!vs_write_before_read(v, v1), "output overwrites input");
+ assert(!vs_write_before_read(v, v2), "output overwrites input");
+ for (int i = 0; i < N; i++) {
+ __ mlsv(v[i], T, v1[i], v2[i]);
+ }
+ }
+
// load N/2 successive pairs of quadword values from memory in order
// into N successive vector registers of the sequence via the
// address supplied in base.
@@ -4723,6 +4737,7 @@ class StubGenerator: public StubCodeGenerator {
// in base using post-increment addressing
template
void vs_ldpq_post(const VSeq& v, Register base) {
+ static_assert((N & 1) == 0, "sequence length must be even");
for (int i = 0; i < N; i += 2) {
__ ldpq(v[i], v[i+1], __ post(base, 32));
}
@@ -4733,11 +4748,55 @@ class StubGenerator: public StubCodeGenerator {
// supplied in base using post-increment addressing
template
void vs_stpq_post(const VSeq& v, Register base) {
+ static_assert((N & 1) == 0, "sequence length must be even");
for (int i = 0; i < N; i += 2) {
__ stpq(v[i], v[i+1], __ post(base, 32));
}
}
+ // load N/2 pairs of quadword values from memory de-interleaved into
+ // N vector registers 2 at a time via the address supplied in base
+ // using post-increment addressing.
+ template
+ void vs_ld2_post(const VSeq& v, Assembler::SIMD_Arrangement T, Register base) {
+ static_assert((N & 1) == 0, "sequence length must be even");
+ for (int i = 0; i < N; i += 2) {
+ __ ld2(v[i], v[i+1], T, __ post(base, 32));
+ }
+ }
+
+ // store N vector registers interleaved into N/2 pairs of quadword
+ // memory locations via the address supplied in base using
+ // post-increment addressing.
+ template
+ void vs_st2_post(const VSeq& v, Assembler::SIMD_Arrangement T, Register base) {
+ static_assert((N & 1) == 0, "sequence length must be even");
+ for (int i = 0; i < N; i += 2) {
+ __ st2(v[i], v[i+1], T, __ post(base, 32));
+ }
+ }
+
+ // load N quadword values from memory de-interleaved into N vector
+ // registers 3 elements at a time via the address supplied in base.
+ template
+ void vs_ld3(const VSeq& v, Assembler::SIMD_Arrangement T, Register base) {
+ static_assert(N == ((N / 3) * 3), "sequence length must be multiple of 3");
+ for (int i = 0; i < N; i += 3) {
+ __ ld3(v[i], v[i+1], v[i+2], T, base);
+ }
+ }
+
+ // load N quadword values from memory de-interleaved into N vector
+ // registers 3 elements at a time via the address supplied in base
+ // using post-increment addressing.
+ template
+ void vs_ld3_post(const VSeq& v, Assembler::SIMD_Arrangement T, Register base) {
+ static_assert(N == ((N / 3) * 3), "sequence length must be multiple of 3");
+ for (int i = 0; i < N; i += 3) {
+ __ ld3(v[i], v[i+1], v[i+2], T, __ post(base, 48));
+ }
+ }
+
// load N/2 pairs of quadword values from memory into N vector
// registers via the address supplied in base with each pair indexed
// using the start offset plus the corresponding entry in the
@@ -4810,23 +4869,29 @@ class StubGenerator: public StubCodeGenerator {
}
}
- // Helper routines for various flavours of dilithium montgomery
- // multiply
+ // Helper routines for various flavours of Montgomery multiply
- // Perform 16 32-bit Montgomery multiplications in parallel
- // See the montMul() method of the sun.security.provider.ML_DSA class.
+ // Perform 16 32-bit (4x4S) or 32 16-bit (4x8H) Montgomery
+ // multiplications in parallel.
+ //
+ // See the montMul() method of the sun.security.provider.ML_DSA
+ // class.
//
- // Computes 4x4S results
- // a = b * c * 2^-32 mod MONT_Q
- // Inputs: vb, vc - 4x4S vector register sequences
- // vq - 2x4S constants
- // Temps: vtmp - 4x4S vector sequence trashed after call
- // Outputs: va - 4x4S vector register sequences
+ // Computes 4x4S results or 4x8H results
+ // a = b * c * 2^-MONT_R_BITS mod MONT_Q
+ // Inputs: vb, vc - 4x4S or 4x8H vector register sequences
+ // vq - 2x4S or 2x8H constants
+ // Temps: vtmp - 4x4S or 4x8H vector sequence trashed after call
+ // Outputs: va - 4x4S or 4x8H vector register sequences
// vb, vc, vtmp and vq must all be disjoint
// va must be disjoint from all other inputs/temps or must equal vc
- // n.b. MONT_R_BITS is 32, so the right shift by it is implicit.
- void dilithium_montmul16(const VSeq<4>& va, const VSeq<4>& vb, const VSeq<4>& vc,
- const VSeq<4>& vtmp, const VSeq<2>& vq) {
+ // va must have a non-zero delta i.e. it must not be a constant vseq.
+ // n.b. MONT_R_BITS is 16 or 32, so the right shift by it is implicit.
+ void vs_montmul4(const VSeq<4>& va, const VSeq<4>& vb, const VSeq<4>& vc,
+ Assembler::SIMD_Arrangement T,
+ const VSeq<4>& vtmp, const VSeq<2>& vq) {
+ assert (T == __ T4S || T == __ T8H, "invalid arrangement for montmul");
assert(vs_disjoint(vb, vc), "vb and vc overlap");
assert(vs_disjoint(vb, vq), "vb and vq overlap");
assert(vs_disjoint(vb, vtmp), "vb and vtmp overlap");
@@ -4840,40 +4905,107 @@ class StubGenerator: public StubCodeGenerator {
assert(vs_disjoint(va, vb), "va and vb overlap");
assert(vs_disjoint(va, vq), "va and vq overlap");
assert(vs_disjoint(va, vtmp), "va and vtmp overlap");
+ assert(!va.is_constant(), "output vector must identify 4 different registers");
// schedule 4 streams of instructions across the vector sequences
for (int i = 0; i < 4; i++) {
- __ sqdmulh(vtmp[i], __ T4S, vb[i], vc[i]); // aHigh = hi32(2 * b * c)
- __ mulv(va[i], __ T4S, vb[i], vc[i]); // aLow = lo32(b * c)
+ __ sqdmulh(vtmp[i], T, vb[i], vc[i]); // aHigh = hi32(2 * b * c)
+ __ mulv(va[i], T, vb[i], vc[i]); // aLow = lo32(b * c)
}
for (int i = 0; i < 4; i++) {
- __ mulv(va[i], __ T4S, va[i], vq[0]); // m = aLow * qinv
+ __ mulv(va[i], T, va[i], vq[0]); // m = aLow * qinv
}
for (int i = 0; i < 4; i++) {
- __ sqdmulh(va[i], __ T4S, va[i], vq[1]); // n = hi32(2 * m * q)
+ __ sqdmulh(va[i], T, va[i], vq[1]); // n = hi32(2 * m * q)
}
for (int i = 0; i < 4; i++) {
- __ shsubv(va[i], __ T4S, vtmp[i], va[i]); // a = (aHigh - n) / 2
+ __ shsubv(va[i], T, vtmp[i], va[i]); // a = (aHigh - n) / 2
}
}
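As a reference for the schedule above, one 16-bit lane behaves like the scalar sketch below (assuming vq[0] holds q^-1 mod 2^16 and vq[1] holds q; saturation in sqdmulh does not trigger for the value ranges used here, and the 32-bit case has the same shape with 2^32):

    // Scalar model of one 8H lane: returns b * c * 2^-16 mod q (up to a
    // bounded multiple of q, as usual for Montgomery reduction).
    #include <cstdint>
    static int16_t montmul_lane(int16_t b, int16_t c, int16_t q, int16_t qinv) {
      int32_t prod  = (int32_t) b * c;
      int16_t aHigh = (int16_t) ((2 * prod) >> 16);                // sqdmulh
      int16_t aLow  = (int16_t) prod;                              // mulv (low half)
      int16_t m     = (int16_t) (aLow * qinv);                     // mulv by vq[0]
      int16_t n     = (int16_t) ((2 * (int32_t) m * q) >> 16);     // sqdmulh by vq[1]
      return (int16_t) ((aHigh - n) >> 1);                         // shsubv: (aHigh - n) / 2
    }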
- // Perform 2x16 32-bit Montgomery multiplications in parallel
- // See the montMul() method of the sun.security.provider.ML_DSA class.
+ // Perform 8 32-bit (2x4S) or 16 16-bit (2x8H) Montgomery
+ // multiplications in parallel.
+ //
+ // See the montMul() method of the sun.security.provider.ML_DSA
+ // class.
//
- // Computes 8x4S results
- // a = b * c * 2^-32 mod MONT_Q
- // Inputs: vb, vc - 8x4S vector register sequences
- // vq - 2x4S constants
- // Temps: vtmp - 4x4S vector sequence trashed after call
- // Outputs: va - 8x4S vector register sequences
+ // Computes 2x4S results or 2x8H results
+ // a = b * c * 2^-MONT_R_BITS mod MONT_Q
+ // Inputs: vb, vc - 2x4S or 2x8H vector register sequences
+ // vq - 2x4S or 2x8H constants
+ // Temps: vtmp - 2x4S or 2x8H vector sequence trashed after call
+ // Outputs: va - 2x4S or 2x8H vector register sequences
// vb, vc, vtmp and vq must all be disjoint
// va must be disjoint from all other inputs/temps or must equal vc
- // n.b. MONT_R_BITS is 32, so the right shift by it is implicit.
- void vs_montmul32(const VSeq<8>& va, const VSeq<8>& vb, const VSeq<8>& vc,
- const VSeq<4>& vtmp, const VSeq<2>& vq) {
+ // va must have a non-zero delta i.e. it must not be a constant vseq.
+ // n.b. MONT_R_BITS is 16 or 32, so the right shift by it is implicit.
+ void vs_montmul2(const VSeq<2>& va, const VSeq<2>& vb, const VSeq<2>& vc,
+ Assembler::SIMD_Arrangement T,
+ const VSeq<2>& vtmp, const VSeq<2>& vq) {
+ assert (T == __ T4S || T == __ T8H, "invalid arrangement for montmul");
+ assert(vs_disjoint(vb, vc), "vb and vc overlap");
+ assert(vs_disjoint(vb, vq), "vb and vq overlap");
+ assert(vs_disjoint(vb, vtmp), "vb and vtmp overlap");
+
+ assert(vs_disjoint(vc, vq), "vc and vq overlap");
+ assert(vs_disjoint(vc, vtmp), "vc and vtmp overlap");
+
+ assert(vs_disjoint(vq, vtmp), "vq and vtmp overlap");
+
+ assert(vs_disjoint(va, vc) || vs_same(va, vc), "va and vc neither disjoint nor equal");
+ assert(vs_disjoint(va, vb), "va and vb overlap");
+ assert(vs_disjoint(va, vq), "va and vq overlap");
+ assert(vs_disjoint(va, vtmp), "va and vtmp overlap");
+ assert(!va.is_constant(), "output vector must identify 2 different registers");
+
+ // schedule 2 streams of instructions across the vector sequences
+ for (int i = 0; i < 2; i++) {
+ __ sqdmulh(vtmp[i], T, vb[i], vc[i]); // aHigh = hi32(2 * b * c)
+ __ mulv(va[i], T, vb[i], vc[i]); // aLow = lo32(b * c)
+ }
+
+ for (int i = 0; i < 2; i++) {
+ __ mulv(va[i], T, va[i], vq[0]); // m = aLow * qinv
+ }
+
+ for (int i = 0; i < 2; i++) {
+ __ sqdmulh(va[i], T, va[i], vq[1]); // n = hi32(2 * m * q)
+ }
+
+ for (int i = 0; i < 2; i++) {
+ __ shsubv(va[i], T, vtmp[i], va[i]); // a = (aHigh - n) / 2
+ }
+ }
+
+ // Perform 16 16-bit Montgomery multiplications in parallel.
+ void kyber_montmul16(const VSeq<2>& va, const VSeq<2>& vb, const VSeq<2>& vc,
+ const VSeq<2>& vtmp, const VSeq<2>& vq) {
+ // Use the helper routine to schedule a 2x8H Montgomery multiply.
+ // It will assert that the register use is valid
+ vs_montmul2(va, vb, vc, __ T8H, vtmp, vq);
+ }
+
+ // Perform 32 16-bit Montgomery multiplications in parallel.
+ void kyber_montmul32(const VSeq<4>& va, const VSeq<4>& vb, const VSeq<4>& vc,
+ const VSeq<4>& vtmp, const VSeq<2>& vq) {
+ // Use the helper routine to schedule a 4x8H Montgomery multiply.
+ // It will assert that the register use is valid
+ vs_montmul4(va, vb, vc, __ T8H, vtmp, vq);
+ }
+
+ // Perform 64 16-bit Montgomery multiplications in parallel.
+ void kyber_montmul64(const VSeq<8>& va, const VSeq<8>& vb, const VSeq<8>& vc,
+ const VSeq<4>& vtmp, const VSeq<2>& vq) {
+ // Schedule two successive 4x8H multiplies via the montmul helper
+ // on the front and back halves of va, vb and vc. The helper will
+ // assert that the register use has no overlap conflicts on each
+ // individual call but we also need to ensure that the necessary
+ // disjoint/equality constraints are met across both calls.
+
// vb, vc, vtmp and vq must be disjoint. va must either be
// disjoint from all other registers or equal vc
@@ -4891,8 +5023,8 @@ class StubGenerator: public StubCodeGenerator {
assert(vs_disjoint(va, vq), "va and vq overlap");
assert(vs_disjoint(va, vtmp), "va and vtmp overlap");
- // we need to multiply the front and back halves of each sequence
- // 4x4S at a time because
+ // we multiply the front and back halves of each sequence 4 at a
+ // time because
//
// 1) we are currently only able to get 4-way instruction
// parallelism at best
@@ -4901,14 +5033,1236 @@ class StubGenerator: public StubCodeGenerator {
// scratch registers to hold intermediate results so vtmp can only
// be a VSeq<4> which means we only have 4 scratch slots
- dilithium_montmul16(vs_front(va), vs_front(vb), vs_front(vc), vtmp, vq);
- dilithium_montmul16(vs_back(va), vs_back(vb), vs_back(vc), vtmp, vq);
+ vs_montmul4(vs_front(va), vs_front(vb), vs_front(vc), __ T8H, vtmp, vq);
+ vs_montmul4(vs_back(va), vs_back(vb), vs_back(vc), __ T8H, vtmp, vq);
+ }
+
+ void kyber_montmul32_sub_add(const VSeq<4>& va0, const VSeq<4>& va1,
+ const VSeq<4>& vc,
+ const VSeq<4>& vtmp,
+ const VSeq<2>& vq) {
+ // compute a = montmul(a1, c)
+ kyber_montmul32(vc, va1, vc, vtmp, vq);
+ // output a1 = a0 - a
+ vs_subv(va1, __ T8H, va0, vc);
+ // and a0 = a0 + a
+ vs_addv(va0, __ T8H, va0, vc);
+ }
+
+ void kyber_sub_add_montmul32(const VSeq<4>& va0, const VSeq<4>& va1,
+ const VSeq<4>& vb,
+ const VSeq<4>& vtmp1,
+ const VSeq<4>& vtmp2,
+ const VSeq<2>& vq) {
+ // compute c = a0 - a1
+ vs_subv(vtmp1, __ T8H, va0, va1);
+ // output a0 = a0 + a1
+ vs_addv(va0, __ T8H, va0, va1);
+ // output a1 = b montmul c
+ kyber_montmul32(va1, vtmp1, vb, vtmp2, vq);
+ }
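These two helpers are the forward (Cooley-Tukey) and inverse (Gentleman-Sande) NTT butterflies. In scalar form, using the hypothetical montmul_lane sketched earlier:

    // kyber_montmul32_sub_add: t = a1 * zeta; a1 = a0 - t; a0 = a0 + t
    static void ct_butterfly(int16_t& a0, int16_t& a1, int16_t zeta, int16_t q, int16_t qinv) {
      int16_t t = montmul_lane(a1, zeta, q, qinv);
      a1 = (int16_t) (a0 - t);
      a0 = (int16_t) (a0 + t);
    }
    // kyber_sub_add_montmul32: t = a0 - a1; a0 = a0 + a1; a1 = t * zeta
    static void gs_butterfly(int16_t& a0, int16_t& a1, int16_t zeta, int16_t q, int16_t qinv) {
      int16_t t = (int16_t) (a0 - a1);
      a0 = (int16_t) (a0 + a1);
      a1 = montmul_lane(t, zeta, q, qinv);
    }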
+
+ void load64shorts(const VSeq<8>& v, Register shorts) {
+ vs_ldpq_post(v, shorts);
+ }
+
+ void load32shorts(const VSeq<4>& v, Register shorts) {
+ vs_ldpq_post(v, shorts);
+ }
+
+ void store64shorts(VSeq<8> v, Register tmpAddr) {
+ vs_stpq_post(v, tmpAddr);
+ }
+
+ // Kyber NTT function.
+ // Implements
+ // static int implKyberNtt(short[] poly, short[] ntt_zetas) {}
+ //
+ // coeffs (short[256]) = c_rarg0
+ // ntt_zetas (short[256]) = c_rarg1
+ address generate_kyberNtt() {
+
+ __ align(CodeEntryAlignment);
+ StubGenStubId stub_id = StubGenStubId::kyberNtt_id;
+ StubCodeMark mark(this, stub_id);
+ address start = __ pc();
+ __ enter();
+
+ const Register coeffs = c_rarg0;
+ const Register zetas = c_rarg1;
+
+ const Register kyberConsts = r10;
+ const Register tmpAddr = r11;
+
+ VSeq<8> vs1(0), vs2(16), vs3(24); // 3 sets of 8x8H inputs/outputs
+ VSeq<4> vtmp = vs_front(vs3); // n.b. tmp registers overlap vs3
+ VSeq<2> vq(30); // n.b. constants overlap vs3
+
+ __ lea(kyberConsts, ExternalAddress((address) StubRoutines::aarch64::_kyberConsts));
+ // load the montmul constants
+ vs_ldpq(vq, kyberConsts);
+
+ // Each level corresponds to an iteration of the outermost loop of the
+ // Java method seilerNTT(int[] coeffs). There are some differences
+ // from what is done in the seilerNTT() method, though:
+ // 1. The computation uses 16-bit signed values; we do not convert them
+ // to ints here.
+ // 2. The zetas are delivered in a bigger array: 128 zetas are stored in
+ // this array for each level, which makes it easier to fill up the vector
+ // registers.
+ // 3. In the seilerNTT() method we use R = 2^20 for the Montgomery
+ // multiplications (that way there should not be any overflow during the
+ // inverse NTT computation); here we use R = 2^16 so that we can use the
+ // 16-bit arithmetic in the vector unit.
+ //
+ // On each level, we fill up the vector registers in such a way that the
+ // array elements that need to be multiplied by the zetas go into one
+ // set of vector registers while the corresponding ones that don't need to
+ // be multiplied, go into another set.
+ // We can do 32 Montgomery multiplications in parallel, using 12 vector
+ // registers interleaving the steps of 4 identical computations,
+ // each done on 8 16-bit values per register.
+
+ // At levels 0-3 the coefficients multiplied by or added/subtracted
+ // to the zetas occur in discrete blocks whose size is some multiple
+ // of 32.
+
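As a hedged scalar picture of one such level (level 0 shown, with coeffs indexed as shorts, zeta values taken from the expanded array described above, and ct_butterfly/q/qinv as sketched earlier):

    // Level 0: distance-128 butterflies; the vector code below performs 64 of
    // them per pass using the 8x8H sequences vs1/vs2/vs3.
    for (int j = 0; j < 128; j++) {
      ct_butterfly(coeffs[j], coeffs[j + 128], zetas[j], q, qinv);
    }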
+ // level 0
+ __ add(tmpAddr, coeffs, 256);
+ load64shorts(vs1, tmpAddr);
+ load64shorts(vs2, zetas);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ __ add(tmpAddr, coeffs, 0);
+ load64shorts(vs1, tmpAddr);
+ vs_subv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_addv(vs1, __ T8H, vs1, vs2);
+ __ add(tmpAddr, coeffs, 0);
+ vs_stpq_post(vs1, tmpAddr);
+ __ add(tmpAddr, coeffs, 256);
+ vs_stpq_post(vs3, tmpAddr);
+ // restore montmul constants
+ vs_ldpq(vq, kyberConsts);
+ load64shorts(vs1, tmpAddr);
+ load64shorts(vs2, zetas);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ __ add(tmpAddr, coeffs, 128);
+ load64shorts(vs1, tmpAddr);
+ vs_subv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_addv(vs1, __ T8H, vs1, vs2);
+ __ add(tmpAddr, coeffs, 128);
+ store64shorts(vs1, tmpAddr);
+ __ add(tmpAddr, coeffs, 384);
+ store64shorts(vs3, tmpAddr);
+
+ // level 1
+ // restore montmul constants
+ vs_ldpq(vq, kyberConsts);
+ __ add(tmpAddr, coeffs, 128);
+ load64shorts(vs1, tmpAddr);
+ load64shorts(vs2, zetas);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ __ add(tmpAddr, coeffs, 0);
+ load64shorts(vs1, tmpAddr);
+ vs_subv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_addv(vs1, __ T8H, vs1, vs2);
+ __ add(tmpAddr, coeffs, 0);
+ store64shorts(vs1, tmpAddr);
+ store64shorts(vs3, tmpAddr);
+ vs_ldpq(vq, kyberConsts);
+ __ add(tmpAddr, coeffs, 384);
+ load64shorts(vs1, tmpAddr);
+ load64shorts(vs2, zetas);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ __ add(tmpAddr, coeffs, 256);
+ load64shorts(vs1, tmpAddr);
+ vs_subv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_addv(vs1, __ T8H, vs1, vs2);
+ __ add(tmpAddr, coeffs, 256);
+ store64shorts(vs1, tmpAddr);
+ store64shorts(vs3, tmpAddr);
+
+ // level 2
+ vs_ldpq(vq, kyberConsts);
+ int offsets1[4] = { 0, 32, 128, 160 };
+ vs_ldpq_indexed(vs1, coeffs, 64, offsets1);
+ load64shorts(vs2, zetas);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ vs_ldpq_indexed(vs1, coeffs, 0, offsets1);
+ vs_subv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_addv(vs1, __ T8H, vs1, vs2);
+ __ add(tmpAddr, coeffs, 0);
+ vs_stpq_post(vs_front(vs1), tmpAddr);
+ vs_stpq_post(vs_front(vs3), tmpAddr);
+ vs_stpq_post(vs_back(vs1), tmpAddr);
+ vs_stpq_post(vs_back(vs3), tmpAddr);
+ vs_ldpq(vq, kyberConsts);
+ vs_ldpq_indexed(vs1, tmpAddr, 64, offsets1);
+ load64shorts(vs2, zetas);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ vs_ldpq_indexed(vs1, coeffs, 256, offsets1);
+ vs_subv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_addv(vs1, __ T8H, vs1, vs2);
+ __ add(tmpAddr, coeffs, 256);
+ vs_stpq_post(vs_front(vs1), tmpAddr);
+ vs_stpq_post(vs_front(vs3), tmpAddr);
+ vs_stpq_post(vs_back(vs1), tmpAddr);
+ vs_stpq_post(vs_back(vs3), tmpAddr);
+
+ // level 3
+ vs_ldpq(vq, kyberConsts);
+ int offsets2[4] = { 0, 64, 128, 192 };
+ vs_ldpq_indexed(vs1, coeffs, 32, offsets2);
+ load64shorts(vs2, zetas);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ vs_ldpq_indexed(vs1, coeffs, 0, offsets2);
+ vs_subv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_addv(vs1, __ T8H, vs1, vs2);
+ vs_stpq_indexed(vs1, coeffs, 0, offsets2);
+ vs_stpq_indexed(vs3, coeffs, 32, offsets2);
+
+ vs_ldpq(vq, kyberConsts);
+ vs_ldpq_indexed(vs1, coeffs, 256 + 32, offsets2);
+ load64shorts(vs2, zetas);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ vs_ldpq_indexed(vs1, coeffs, 256, offsets2);
+ vs_subv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_addv(vs1, __ T8H, vs1, vs2);
+ vs_stpq_indexed(vs1, coeffs, 256, offsets2);
+ vs_stpq_indexed(vs3, coeffs, 256 + 32, offsets2);
+
+ // level 4
+ // At level 4 coefficients occur in 8 discrete blocks of size 16
+ // so they are loaded using an ldr at 8 distinct offsets.
+
+ vs_ldpq(vq, kyberConsts);
+ int offsets3[8] = { 0, 32, 64, 96, 128, 160, 192, 224 };
+ vs_ldr_indexed(vs1, __ Q, coeffs, 16, offsets3);
+ load64shorts(vs2, zetas);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ vs_ldr_indexed(vs1, __ Q, coeffs, 0, offsets3);
+ vs_subv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_addv(vs1, __ T8H, vs1, vs2);
+ vs_str_indexed(vs1, __ Q, coeffs, 0, offsets3);
+ vs_str_indexed(vs3, __ Q, coeffs, 16, offsets3);
+
+ vs_ldpq(vq, kyberConsts);
+ vs_ldr_indexed(vs1, __ Q, coeffs, 256 + 16, offsets3);
+ load64shorts(vs2, zetas);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ vs_ldr_indexed(vs1, __ Q, coeffs, 256, offsets3);
+ vs_subv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_addv(vs1, __ T8H, vs1, vs2);
+ vs_str_indexed(vs1, __ Q, coeffs, 256, offsets3);
+ vs_str_indexed(vs3, __ Q, coeffs, 256 + 16, offsets3);
+
+ // level 5
+ // At level 5 related coefficients occur in discrete blocks of size 8, so
+ // they need to be loaded interleaved using an ld2 operation with arrangement 2D.
+
+ vs_ldpq(vq, kyberConsts);
+ int offsets4[4] = { 0, 32, 64, 96 };
+ vs_ld2_indexed(vs1, __ T2D, coeffs, tmpAddr, 0, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_montmul32_sub_add(vs_even(vs1), vs_odd(vs1), vs_front(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T2D, coeffs, tmpAddr, 0, offsets4);
+ vs_ld2_indexed(vs1, __ T2D, coeffs, tmpAddr, 128, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_montmul32_sub_add(vs_even(vs1), vs_odd(vs1), vs_front(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T2D, coeffs, tmpAddr, 128, offsets4);
+ vs_ld2_indexed(vs1, __ T2D, coeffs, tmpAddr, 256, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_montmul32_sub_add(vs_even(vs1), vs_odd(vs1), vs_front(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T2D, coeffs, tmpAddr, 256, offsets4);
+
+ vs_ld2_indexed(vs1, __ T2D, coeffs, tmpAddr, 384, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_montmul32_sub_add(vs_even(vs1), vs_odd(vs1), vs_front(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T2D, coeffs, tmpAddr, 384, offsets4);
+
+ // level 6
+ // At level 6 related coefficients occur in discrete blocks of size 4, so
+ // they need to be loaded interleaved using an ld2 operation with arrangement 4S.
+
+ vs_ld2_indexed(vs1, __ T4S, coeffs, tmpAddr, 0, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_montmul32_sub_add(vs_even(vs1), vs_odd(vs1), vs_front(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T4S, coeffs, tmpAddr, 0, offsets4);
+ vs_ld2_indexed(vs1, __ T4S, coeffs, tmpAddr, 128, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_montmul32_sub_add(vs_even(vs1), vs_odd(vs1), vs_front(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T4S, coeffs, tmpAddr, 128, offsets4);
+
+ vs_ld2_indexed(vs1, __ T4S, coeffs, tmpAddr, 256, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_montmul32_sub_add(vs_even(vs1), vs_odd(vs1), vs_front(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T4S, coeffs, tmpAddr, 256, offsets4);
+
+ vs_ld2_indexed(vs1, __ T4S, coeffs, tmpAddr, 384, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_montmul32_sub_add(vs_even(vs1), vs_odd(vs1), vs_front(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T4S, coeffs, tmpAddr, 384, offsets4);
+
+ __ leave(); // required for proper stackwalking of RuntimeStub frame
+ __ mov(r0, zr); // return 0
+ __ ret(lr);
+
+ return start;
+ }
+
+ // Kyber Inverse NTT function
+ // Implements
+ // static int implKyberInverseNtt(short[] poly, short[] zetas) {}
+ //
+ // coeffs (short[256]) = c_rarg0
+ // ntt_zetas (short[256]) = c_rarg1
+ address generate_kyberInverseNtt() {
+
+ __ align(CodeEntryAlignment);
+ StubGenStubId stub_id = StubGenStubId::kyberInverseNtt_id;
+ StubCodeMark mark(this, stub_id);
+ address start = __ pc();
+ __ enter();
+
+ const Register coeffs = c_rarg0;
+ const Register zetas = c_rarg1;
+
+ const Register kyberConsts = r10;
+ const Register tmpAddr = r11;
+ const Register tmpAddr2 = c_rarg2;
+
+ VSeq<8> vs1(0), vs2(16), vs3(24); // 3 sets of 8x8H inputs/outputs
+ VSeq<4> vtmp = vs_front(vs3); // n.b. tmp registers overlap vs3
+ VSeq<2> vq(30); // n.b. constants overlap vs3
+
+ __ lea(kyberConsts,
+ ExternalAddress((address) StubRoutines::aarch64::_kyberConsts));
+
+ // level 0
+ // At level 0 related coefficients occur in discrete blocks of size 4, so
+ // they need to be loaded interleaved using an ld2 operation with arrangement 4S.
+
+ vs_ldpq(vq, kyberConsts);
+ int offsets4[4] = { 0, 32, 64, 96 };
+ vs_ld2_indexed(vs1, __ T4S, coeffs, tmpAddr, 0, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_sub_add_montmul32(vs_even(vs1), vs_odd(vs1),
+ vs_front(vs2), vs_back(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T4S, coeffs, tmpAddr, 0, offsets4);
+ vs_ld2_indexed(vs1, __ T4S, coeffs, tmpAddr, 128, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_sub_add_montmul32(vs_even(vs1), vs_odd(vs1),
+ vs_front(vs2), vs_back(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T4S, coeffs, tmpAddr, 128, offsets4);
+ vs_ld2_indexed(vs1, __ T4S, coeffs, tmpAddr, 256, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_sub_add_montmul32(vs_even(vs1), vs_odd(vs1),
+ vs_front(vs2), vs_back(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T4S, coeffs, tmpAddr, 256, offsets4);
+ vs_ld2_indexed(vs1, __ T4S, coeffs, tmpAddr, 384, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_sub_add_montmul32(vs_even(vs1), vs_odd(vs1),
+ vs_front(vs2), vs_back(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T4S, coeffs, tmpAddr, 384, offsets4);
+
+ // level 1
+ // At level 1 related coefficients occur in discrete blocks of size 8, so
+ // they need to be loaded interleaved using an ld2 operation with arrangement 2D.
+
+ vs_ld2_indexed(vs1, __ T2D, coeffs, tmpAddr, 0, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_sub_add_montmul32(vs_even(vs1), vs_odd(vs1),
+ vs_front(vs2), vs_back(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T2D, coeffs, tmpAddr, 0, offsets4);
+ vs_ld2_indexed(vs1, __ T2D, coeffs, tmpAddr, 128, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_sub_add_montmul32(vs_even(vs1), vs_odd(vs1),
+ vs_front(vs2), vs_back(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T2D, coeffs, tmpAddr, 128, offsets4);
+
+ vs_ld2_indexed(vs1, __ T2D, coeffs, tmpAddr, 256, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_sub_add_montmul32(vs_even(vs1), vs_odd(vs1),
+ vs_front(vs2), vs_back(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T2D, coeffs, tmpAddr, 256, offsets4);
+ vs_ld2_indexed(vs1, __ T2D, coeffs, tmpAddr, 384, offsets4);
+ load32shorts(vs_front(vs2), zetas);
+ kyber_sub_add_montmul32(vs_even(vs1), vs_odd(vs1),
+ vs_front(vs2), vs_back(vs2), vtmp, vq);
+ vs_st2_indexed(vs1, __ T2D, coeffs, tmpAddr, 384, offsets4);
+
+ // level 2
+ // At level 2 coefficients occur in 8 discrete blocks of size 16
+ // so they are loaded using an ldr at 8 distinct offsets.
+
+ int offsets3[8] = { 0, 32, 64, 96, 128, 160, 192, 224 };
+ vs_ldr_indexed(vs1, __ Q, coeffs, 0, offsets3);
+ vs_ldr_indexed(vs2, __ Q, coeffs, 16, offsets3);
+ vs_addv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_subv(vs1, __ T8H, vs1, vs2);
+ vs_str_indexed(vs3, __ Q, coeffs, 0, offsets3);
+ load64shorts(vs2, zetas);
+ vs_ldpq(vq, kyberConsts);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ vs_str_indexed(vs2, __ Q, coeffs, 16, offsets3);
+
+ vs_ldr_indexed(vs1, __ Q, coeffs, 256, offsets3);
+ vs_ldr_indexed(vs2, __ Q, coeffs, 256 + 16, offsets3);
+ vs_addv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_subv(vs1, __ T8H, vs1, vs2);
+ vs_str_indexed(vs3, __ Q, coeffs, 256, offsets3);
+ load64shorts(vs2, zetas);
+ vs_ldpq(vq, kyberConsts);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ vs_str_indexed(vs2, __ Q, coeffs, 256 + 16, offsets3);
+
+ // Barrett reduction at indexes where overflow may happen
+
+ // load q and the multiplier for the Barrett reduction
+ __ add(tmpAddr, kyberConsts, 16);
+ vs_ldpq(vq, tmpAddr);
+
+ VSeq<8> vq1 = VSeq<8>(vq[0], 0); // 2 constant 8 sequences
+ VSeq<8> vq2 = VSeq<8>(vq[1], 0); // for above two kyber constants
+ VSeq<8> vq3 = VSeq<8>(v29, 0); // 3rd sequence for const montmul
+ vs_ldr_indexed(vs1, __ Q, coeffs, 0, offsets3);
+ vs_sqdmulh(vs2, __ T8H, vs1, vq2);
+ vs_sshr(vs2, __ T8H, vs2, 11);
+ vs_mlsv(vs1, __ T8H, vs2, vq1);
+ vs_str_indexed(vs1, __ Q, coeffs, 0, offsets3);
+ vs_ldr_indexed(vs1, __ Q, coeffs, 256, offsets3);
+ vs_sqdmulh(vs2, __ T8H, vs1, vq2);
+ vs_sshr(vs2, __ T8H, vs2, 11);
+ vs_mlsv(vs1, __ T8H, vs2, vq1);
+ vs_str_indexed(vs1, __ Q, coeffs, 256, offsets3);
+
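A scalar model of this Barrett step, assuming (as the sqdmulh/sshr #11 pair implies) that vq2 holds a multiplier of roughly 2^26/q:

    // Per-lane sketch of the sqdmulh / sshr / mlsv sequence above.
    static int16_t barrett_reduce_lane(int16_t a, int16_t q, int16_t bmul) {
      int32_t t = ((2 * (int32_t) a * bmul) >> 16) >> 11;   // sqdmulh by vq2, then sshr #11
      return (int16_t) (a - t * q);                         // mlsv by vq1
    }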
+ // level 3
+ // From level 3 upwards coefficients occur in discrete blocks whose size is
+ // some multiple of 32, so they can be loaded using ldpq and suitable indexes.
+
+ int offsets2[4] = { 0, 64, 128, 192 };
+ vs_ldpq_indexed(vs1, coeffs, 0, offsets2);
+ vs_ldpq_indexed(vs2, coeffs, 32, offsets2);
+ vs_addv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_subv(vs1, __ T8H, vs1, vs2);
+ vs_stpq_indexed(vs3, coeffs, 0, offsets2);
+ load64shorts(vs2, zetas);
+ vs_ldpq(vq, kyberConsts);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ vs_stpq_indexed(vs2, coeffs, 32, offsets2);
+
+ vs_ldpq_indexed(vs1, coeffs, 256, offsets2);
+ vs_ldpq_indexed(vs2, coeffs, 256 + 32, offsets2);
+ vs_addv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_subv(vs1, __ T8H, vs1, vs2);
+ vs_stpq_indexed(vs3, coeffs, 256, offsets2);
+ load64shorts(vs2, zetas);
+ vs_ldpq(vq, kyberConsts);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ vs_stpq_indexed(vs2, coeffs, 256 + 32, offsets2);
+
+ // level 4
+
+ int offsets1[4] = { 0, 32, 128, 160 };
+ vs_ldpq_indexed(vs1, coeffs, 0, offsets1);
+ vs_ldpq_indexed(vs2, coeffs, 64, offsets1);
+ vs_addv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_subv(vs1, __ T8H, vs1, vs2);
+ vs_stpq_indexed(vs3, coeffs, 0, offsets1);
+ load64shorts(vs2, zetas);
+ vs_ldpq(vq, kyberConsts);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ vs_stpq_indexed(vs2, coeffs, 64, offsets1);
+
+ vs_ldpq_indexed(vs1, coeffs, 256, offsets1);
+ vs_ldpq_indexed(vs2, coeffs, 256 + 64, offsets1);
+ vs_addv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_subv(vs1, __ T8H, vs1, vs2);
+ vs_stpq_indexed(vs3, coeffs, 256, offsets1);
+ load64shorts(vs2, zetas);
+ vs_ldpq(vq, kyberConsts);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ vs_stpq_indexed(vs2, coeffs, 256 + 64, offsets1);
+
+ // level 5
+
+ __ add(tmpAddr, coeffs, 0);
+ load64shorts(vs1, tmpAddr);
+ __ add(tmpAddr, coeffs, 128);
+ load64shorts(vs2, tmpAddr);
+ vs_addv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_subv(vs1, __ T8H, vs1, vs2);
+ __ add(tmpAddr, coeffs, 0);
+ store64shorts(vs3, tmpAddr);
+ load64shorts(vs2, zetas);
+ vs_ldpq(vq, kyberConsts);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ __ add(tmpAddr, coeffs, 128);
+ store64shorts(vs2, tmpAddr);
+
+ load64shorts(vs1, tmpAddr);
+ __ add(tmpAddr, coeffs, 384);
+ load64shorts(vs2, tmpAddr);
+ vs_addv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_subv(vs1, __ T8H, vs1, vs2);
+ __ add(tmpAddr, coeffs, 256);
+ store64shorts(vs3, tmpAddr);
+ load64shorts(vs2, zetas);
+ vs_ldpq(vq, kyberConsts);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ __ add(tmpAddr, coeffs, 384);
+ store64shorts(vs2, tmpAddr);
+
+ // Barrett reduction at indexes where overflow may happen
+
+ // load q and the multiplier for the Barrett reduction
+ __ add(tmpAddr, kyberConsts, 16);
+ vs_ldpq(vq, tmpAddr);
+
+ int offsets0[2] = { 0, 256 };
+ vs_ldpq_indexed(vs_front(vs1), coeffs, 0, offsets0);
+ vs_sqdmulh(vs2, __ T8H, vs1, vq2);
+ vs_sshr(vs2, __ T8H, vs2, 11);
+ vs_mlsv(vs1, __ T8H, vs2, vq1);
+ vs_stpq_indexed(vs_front(vs1), coeffs, 0, offsets0);
+
+ // level 6
+
+ __ add(tmpAddr, coeffs, 0);
+ load64shorts(vs1, tmpAddr);
+ __ add(tmpAddr, coeffs, 256);
+ load64shorts(vs2, tmpAddr);
+ vs_addv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_subv(vs1, __ T8H, vs1, vs2);
+ __ add(tmpAddr, coeffs, 0);
+ store64shorts(vs3, tmpAddr);
+ load64shorts(vs2, zetas);
+ vs_ldpq(vq, kyberConsts);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ __ add(tmpAddr, coeffs, 256);
+ store64shorts(vs2, tmpAddr);
+
+ __ add(tmpAddr, coeffs, 128);
+ load64shorts(vs1, tmpAddr);
+ __ add(tmpAddr, coeffs, 384);
+ load64shorts(vs2, tmpAddr);
+ vs_addv(vs3, __ T8H, vs1, vs2); // n.b. trashes vq
+ vs_subv(vs1, __ T8H, vs1, vs2);
+ __ add(tmpAddr, coeffs, 128);
+ store64shorts(vs3, tmpAddr);
+ load64shorts(vs2, zetas);
+ vs_ldpq(vq, kyberConsts);
+ kyber_montmul64(vs2, vs1, vs2, vtmp, vq);
+ __ add(tmpAddr, coeffs, 384);
+ store64shorts(vs2, tmpAddr);
+
+ // multiply by 2^-n
+
+ // load toMont(2^-n mod q)
+ __ add(tmpAddr, kyberConsts, 48);
+ __ ldr(v29, __ Q, tmpAddr);
+
+ vs_ldpq(vq, kyberConsts);
+ __ add(tmpAddr, coeffs, 0);
+ load64shorts(vs1, tmpAddr);
+ kyber_montmul64(vs2, vs1, vq3, vtmp, vq);
+ __ add(tmpAddr, coeffs, 0);
+ store64shorts(vs2, tmpAddr);
+
+ // now tmpAddr contains coeffs + 128 because store64shorts post-incremented it
+ load64shorts(vs1, tmpAddr);
+ kyber_montmul64(vs2, vs1, vq3, vtmp, vq);
+ __ add(tmpAddr, coeffs, 128);
+ store64shorts(vs2, tmpAddr);
+
+ // now tmpAddr contains coeffs + 256
+ load64shorts(vs1, tmpAddr);
+ kyber_montmul64(vs2, vs1, vq3, vtmp, vq);
+ __ add(tmpAddr, coeffs, 256);
+ store64shorts(vs2, tmpAddr);
+
+ // now tmpAddr contains coeffs + 384
+ load64shorts(vs1, tmpAddr);
+ kyber_montmul64(vs2, vs1, vq3, vtmp, vq);
+ __ add(tmpAddr, coeffs, 384);
+ store64shorts(vs2, tmpAddr);
+
+ __ leave(); // required for proper stackwalking of RuntimeStub frame
+ __ mov(r0, zr); // return 0
+ __ ret(lr);
+
+ return start;
+ }
+
+ // Kyber multiply polynomials in the NTT domain.
+ // Implements
+ // static int implKyberNttMult(
+ // short[] result, short[] ntta, short[] nttb, short[] zetas) {}
+ //
+ // result (short[256]) = c_rarg0
+ // ntta (short[256]) = c_rarg1
+ // nttb (short[256]) = c_rarg2
+ // zetas (short[128]) = c_rarg3
+ address generate_kyberNttMult() {
+
+ __ align(CodeEntryAlignment);
+ StubGenStubId stub_id = StubGenStubId::kyberNttMult_id;
+ StubCodeMark mark(this, stub_id);
+ address start = __ pc();
+ __ enter();
+
+ const Register result = c_rarg0;
+ const Register ntta = c_rarg1;
+ const Register nttb = c_rarg2;
+ const Register zetas = c_rarg3;
+
+ const Register kyberConsts = r10;
+ const Register limit = r11;
+
+ VSeq<4> vs1(0), vs2(4); // 4 sets of 4x8H inputs/outputs/tmps
+ VSeq<4> vs3(16), vs4(20);
+ VSeq<2> vq(30); // pair of constants for montmul: q, qinv
+ VSeq<2> vz(28); // pair of zetas
+ VSeq<4> vc(27, 0); // constant sequence for montmul: montRSquareModQ
+
+ __ lea(kyberConsts,
+ ExternalAddress((address) StubRoutines::aarch64::_kyberConsts));
+
+ Label kyberNttMult_loop;
+
+ __ add(limit, result, 512);
+
+ // load q and qinv
+ vs_ldpq(vq, kyberConsts);
+
+ // load R^2 mod q (to convert back from Montgomery representation)
+ __ add(kyberConsts, kyberConsts, 64);
+ __ ldr(v27, __ Q, kyberConsts);
+
+ __ BIND(kyberNttMult_loop);
+
+ // load 16 zetas
+ vs_ldpq_post(vz, zetas);
+
+ // load 2 sets of 32 coefficients from the two input arrays
+ // interleaved as shorts, i.e. pairs of shorts adjacent in memory
+ // are striped across pairs of vector registers
+ vs_ld2_post(vs_front(vs1), __ T8H, ntta); // x 8H
+ vs_ld2_post(vs_back(vs1), __ T8H, nttb); // x 8H
+ vs_ld2_post(vs_front(vs4), __ T8H, ntta); // x 8H
+ vs_ld2_post(vs_back(vs4), __ T8H, nttb); // x 8H
+
+ // compute 4 montmul cross-products for pairs (a0,a1) and (b0,b1)
+ // i.e. montmul the first and second halves of vs1 in order and
+ // then with one sequence reversed storing the two results in vs3
+ //
+ // vs3[0] <- montmul(a0, b0)
+ // vs3[1] <- montmul(a1, b1)
+ // vs3[2] <- montmul(a0, b1)
+ // vs3[3] <- montmul(a1, b0)
+ kyber_montmul16(vs_front(vs3), vs_front(vs1), vs_back(vs1), vs_front(vs2), vq);
+ kyber_montmul16(vs_back(vs3),
+ vs_front(vs1), vs_reverse(vs_back(vs1)), vs_back(vs2), vq);
+
+ // compute 4 montmul cross-products for pairs (a2,a3) and (b2,b3)
+ // i.e. montmul the first and second halves of vs4 in order and
+ // then with one sequence reversed storing the two results in vs1
+ //
+ // vs1[0] <- montmul(a2, b2)
+ // vs1[1] <- montmul(a3, b3)
+ // vs1[2] <- montmul(a2, b3)
+ // vs1[3] <- montmul(a3, b2)
+ kyber_montmul16(vs_front(vs1), vs_front(vs4), vs_back(vs4), vs_front(vs2), vq);
+ kyber_montmul16(vs_back(vs1),
+ vs_front(vs4), vs_reverse(vs_back(vs4)), vs_back(vs2), vq);
+
+ // montmul the second result of each cross-product, i.e. (a1*b1, a3*b3), by a zeta.
+ // We can schedule two montmuls at a time if we use a suitable vector
+ // sequence.
+ int delta = vs1[1]->encoding() - vs3[1]->encoding();
+ VSeq<2> vs5(vs3[1], delta);
+
+ // vs3[1] <- montmul(montmul(a1, b1), z0)
+ // vs1[1] <- montmul(montmul(a3, b3), z1)
+ kyber_montmul16(vs5, vz, vs5, vs_front(vs2), vq);
+
+ // add results in pairs storing in vs3
+ // vs3[0] <- montmul(a0, b0) + montmul(montmul(a1, b1), z0);
+ // vs3[1] <- montmul(a0, b1) + montmul(a1, b0);
+ vs_addv(vs_front(vs3), __ T8H, vs_even(vs3), vs_odd(vs3));
+
+ // vs3[2] <- montmul(a2, b2) + montmul(montmul(a3, b3), z1);
+ // vs3[3] <- montmul(a2, b3) + montmul(a3, b2);
+ vs_addv(vs_back(vs3), __ T8H, vs_even(vs1), vs_odd(vs1));
+
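The cross-products and pairwise adds above implement the NTT-domain multiplication of degree-1 polynomials modulo (X^2 - zeta). Per coefficient pair, and with the hypothetical montmul_lane from earlier, this is:

    // (a0 + a1*X) * (b0 + b1*X) mod (X^2 - zeta), given a0/a1/b0/b1/zeta/q/qinv in scope
    int16_t r0 = (int16_t) (montmul_lane(a0, b0, q, qinv) +
                            montmul_lane(montmul_lane(a1, b1, q, qinv), zeta, q, qinv));
    int16_t r1 = (int16_t) (montmul_lane(a0, b1, q, qinv) +
                            montmul_lane(a1, b0, q, qinv));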
+ // vs1 <- montmul(vs3, montRSquareModQ)
+ kyber_montmul32(vs1, vs3, vc, vs2, vq);
+
+ // store back the two pairs of result vectors de-interleaved as 8H elements
+ // i.e. storing each pair of shorts striped across a register pair adjacent
+ // in memory
+ vs_st2_post(vs1, __ T8H, result);
+
+ __ cmp(result, limit);
+ __ br(Assembler::NE, kyberNttMult_loop);
+
+ __ leave(); // required for proper stackwalking of RuntimeStub frame
+ __ mov(r0, zr); // return 0
+ __ ret(lr);
+
+ return start;
+ }
+
+ // Kyber add 2 polynomials.
+ // Implements
+ // static int implKyberAddPoly(short[] result, short[] a, short[] b) {}
+ //
+ // result (short[256]) = c_rarg0
+ // a (short[256]) = c_rarg1
+ // b (short[256]) = c_rarg2
+ address generate_kyberAddPoly_2() {
+
+ __ align(CodeEntryAlignment);
+ StubGenStubId stub_id = StubGenStubId::kyberAddPoly_2_id;
+ StubCodeMark mark(this, stub_id);
+ address start = __ pc();
+ __ enter();
+
+ const Register result = c_rarg0;
+ const Register a = c_rarg1;
+ const Register b = c_rarg2;
+
+ const Register kyberConsts = r11;
+
+ // We sum 256 sets of values in total i.e. 32 x 8H quadwords.
+ // So, we can load, add and store the data in 3 groups of 11,
+ // 11 and 10 at a time i.e. we need to map sets of 10 or 11
+ // registers. A further constraint is that the mapping needs
+ // to skip callee saves. So, we allocate the register
+ // sequences using two 8 sequences, two 2 sequences and two
+ // single registers.
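Functionally each element of the result is just the sum of the two inputs plus the constant loaded from kyberConsts below (q, per the comment on the ldr); a scalar sketch:

    for (int i = 0; i < 256; i++) {
      result[i] = (short) (a[i] + b[i] + q);
    }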
+ VSeq<8> vs1_1(0);
+ VSeq<2> vs1_2(16);
+ FloatRegister vs1_3 = v28;
+ VSeq<8> vs2_1(18);
+ VSeq<2> vs2_2(26);
+ FloatRegister vs2_3 = v29;
+
+ // two constant vector sequences
+ VSeq<8> vc_1(31, 0);
+ VSeq<2> vc_2(31, 0);
+
+ FloatRegister vc_3 = v31;
+ __ lea(kyberConsts,
+ ExternalAddress((address) StubRoutines::aarch64::_kyberConsts));
+
+ __ ldr(vc_3, __ Q, Address(kyberConsts, 16)); // q
+ for (int i = 0; i < 3; i++) {
+ // load 80 or 88 values from a into vs1_1/2/3
+ vs_ldpq_post(vs1_1, a);
+ vs_ldpq_post(vs1_2, a);
+ if (i < 2) {
+ __ ldr(vs1_3, __ Q, __ post(a, 16));
+ }
+ // load 80 or 88 values from b into vs2_1/2/3
+ vs_ldpq_post(vs2_1, b);
+ vs_ldpq_post(vs2_2, b);
+ if (i < 2) {
+ __ ldr(vs2_3, __ Q, __ post(b, 16));
+ }
+ // sum 80 or 88 values across vs1 and vs2 into vs1
+ vs_addv(vs1_1, __ T8H, vs1_1, vs2_1);
+ vs_addv(vs1_2, __ T8H, vs1_2, vs2_2);
+ if (i < 2) {
+ __ addv(vs1_3, __ T8H, vs1_3, vs2_3);
+ }
+ // add constant to all 80 or 88 results
+ vs_addv(vs1_1, __ T8H, vs1_1, vc_1);
+ vs_addv(vs1_2, __ T8H, vs1_2, vc_2);
+ if (i < 2) {
+ __ addv(vs1_3, __ T8H, vs1_3, vc_3);
+ }
+ // store 80 or 88 values
+ vs_stpq_post(vs1_1, result);
+ vs_stpq_post(vs1_2, result);
+ if (i < 2) {
+ __ str(vs1_3, __ Q, __ post(result, 16));
+ }
+ }
+
+ __ leave(); // required for proper stackwalking of RuntimeStub frame
+ __ mov(r0, zr); // return 0
+ __ ret(lr);
+
+ return start;
+ }
+
+ // Kyber add 3 polynomials.
+ // Implements
+ // static int implKyberAddPoly(short[] result, short[] a, short[] b, short[] c) {}
+ //
+ // result (short[256]) = c_rarg0
+ // a (short[256]) = c_rarg1
+ // b (short[256]) = c_rarg2
+ // c (short[256]) = c_rarg3
+ address generate_kyberAddPoly_3() {
+
+ __ align(CodeEntryAlignment);
+ StubGenStubId stub_id = StubGenStubId::kyberAddPoly_3_id;
+ StubCodeMark mark(this, stub_id);
+ address start = __ pc();
+ __ enter();
+
+ const Register result = c_rarg0;
+ const Register a = c_rarg1;
+ const Register b = c_rarg2;
+ const Register c = c_rarg3;
+
+ const Register kyberConsts = r11;
+
+ // As above we sum 256 sets of values in total i.e. 32 x 8H
+ // quadwords. So, we can load, add and store the data in 3
+ // groups of 11, 11 and 10 at a time i.e. we need to map sets
+ // of 10 or 11 registers. A further constraint is that the
+ // mapping needs to skip callee saves. So, we allocate the
+ // register sequences using two 8 sequences, two 2 sequences
+ // and two single registers.
+ VSeq<8> vs1_1(0);
+ VSeq<2> vs1_2(16);
+ FloatRegister vs1_3 = v28;
+ VSeq<8> vs2_1(18);
+ VSeq<2> vs2_2(26);
+ FloatRegister vs2_3 = v29;
+
+ // two constant vector sequences
+ VSeq<8> vc_1(31, 0);
+ VSeq<2> vc_2(31, 0);
+
+ FloatRegister vc_3 = v31;
+
+ __ lea(kyberConsts,
+ ExternalAddress((address) StubRoutines::aarch64::_kyberConsts));
+
+ __ ldr(vc_3, __ Q, Address(kyberConsts, 16)); // q
+ for (int i = 0; i < 3; i++) {
+ // load 80 or 88 values from a into vs1_1/2/3
+ vs_ldpq_post(vs1_1, a);
+ vs_ldpq_post(vs1_2, a);
+ if (i < 2) {
+ __ ldr(vs1_3, __ Q, __ post(a, 16));
+ }
+ // load 80 or 88 values from b into vs2_1/2/3
+ vs_ldpq_post(vs2_1, b);
+ vs_ldpq_post(vs2_2, b);
+ if (i < 2) {
+ __ ldr(vs2_3, __ Q, __ post(b, 16));
+ }
+ // sum 80 or 88 values across vs1 and vs2 into vs1
+ vs_addv(vs1_1, __ T8H, vs1_1, vs2_1);
+ vs_addv(vs1_2, __ T8H, vs1_2, vs2_2);
+ if (i < 2) {
+ __ addv(vs1_3, __ T8H, vs1_3, vs2_3);
+ }
+ // load 80 or 88 values from c into vs2_1/2/3
+ vs_ldpq_post(vs2_1, c);
+ vs_ldpq_post(vs2_2, c);
+ if (i < 2) {
+ __ ldr(vs2_3, __ Q, __ post(c, 16));
+ }
+ // sum 80 or 88 values across vs1 and vs2 into vs1
+ vs_addv(vs1_1, __ T8H, vs1_1, vs2_1);
+ vs_addv(vs1_2, __ T8H, vs1_2, vs2_2);
+ if (i < 2) {
+ __ addv(vs1_3, __ T8H, vs1_3, vs2_3);
+ }
+ // add constant to all 80 or 88 results
+ vs_addv(vs1_1, __ T8H, vs1_1, vc_1);
+ vs_addv(vs1_2, __ T8H, vs1_2, vc_2);
+ if (i < 2) {
+ __ addv(vs1_3, __ T8H, vs1_3, vc_3);
+ }
+ // store 80 or 88 values
+ vs_stpq_post(vs1_1, result);
+ vs_stpq_post(vs1_2, result);
+ if (i < 2) {
+ __ str(vs1_3, __ Q, __ post(result, 16));
+ }
+ }
+
+ __ leave(); // required for proper stackwalking of RuntimeStub frame
+ __ mov(r0, zr); // return 0
+ __ ret(lr);
+
+ return start;
}
- // perform combined montmul then add/sub on 4x4S vectors
+ // Kyber parse XOF output to polynomial coefficient candidates
+ // or decodePoly(12, ...).
+ // Implements
+ // static int implKyber12To16(
+ // byte[] condensed, int index, short[] parsed, int parsedLength) {}
+ //
+ // (parsedLength or (parsedLength - 48) must be divisible by 64.)
+ //
+ // condensed (byte[]) = c_rarg0
+ // condensedIndex = c_rarg1
+ // parsed (short[112 or 256]) = c_rarg2
+ // parsedLength (112 or 256) = c_rarg3
+ address generate_kyber12To16() {
+ Label L_F00, L_loop, L_end;
+
+ __ BIND(L_F00);
+ __ emit_int64(0x0f000f000f000f00);
+ __ emit_int64(0x0f000f000f000f00);
+
+ __ align(CodeEntryAlignment);
+ StubGenStubId stub_id = StubGenStubId::kyber12To16_id;
+ StubCodeMark mark(this, stub_id);
+ address start = __ pc();
+ __ enter();
+
+ const Register condensed = c_rarg0;
+ const Register condensedOffs = c_rarg1;
+ const Register parsed = c_rarg2;
+ const Register parsedLength = c_rarg3;
+
+ const Register tmpAddr = r11;
+
+ // Data is input 96 bytes at a time, i.e. in groups of 6 x 16B
+ // quadwords, so we need a 6 vector sequence for the inputs.
+ // Parsing produces 64 shorts, employing two 8 vector
+ // sequences to store and combine the intermediate data.
+ VSeq<6> vin(24);
+ VSeq<8> va(0), vb(16);
+
+ __ adr(tmpAddr, L_F00);
+ __ ldr(v31, __ Q, tmpAddr); // 8H times 0x0f00
+ __ add(condensed, condensed, condensedOffs);
+
+ __ BIND(L_loop);
+ // load 96 (6 x 16B) byte values
+ vs_ld3_post(vin, __ T16B, condensed);
+
+ // The front half of sequence vin (vin[0], vin[1] and vin[2])
+ // holds 48 (16x3) contiguous bytes from memory striped
+ // horizontally across each of the 16 byte lanes. Equivalently,
+ // that is 16 pairs of 12-bit integers. Likewise the back half
+ // holds the next 48 bytes in the same arrangement.
+
+ // Each vector in the front half can also be viewed as a vertical
+ // strip across the 16 pairs of 12 bit integers. Each byte in
+ // vin[0] stores the low 8 bits of the first int in a pair. Each
+ // byte in vin[1] stores the high 4 bits of the first int and the
+ // low 4 bits of the second int. Each byte in vin[2] stores the
+ // high 8 bits of the second int. Likewise the vectors in second
+ // half.
+
+ // Converting the data to 16-bit shorts requires first of all
+ // expanding each of the 6 x 16B vectors into 6 corresponding
+ // pairs of 8H vectors. Mask, shift and add operations on the
+ // resulting vector pairs can be used to combine 4 and 8 bit
+ // parts of related 8H vector elements.
+ //
+ // The middle vectors (vin[1] and vin[4]) are actually expanded
+ // twice, one copy manipulated to provide the high 4 bits
+ // belonging to the first short in a pair and another copy
+ // manipulated to provide the low 4 bits belonging to the
+ // second short in a pair. This is why the vector sequences va
+ // and vb used to hold the expanded 8H elements are of length 8.
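+
+ // As a scalar sketch of the decode performed below (illustrative
+ // only, mirroring the lane-wise mask/shift/add steps), each three
+ // input bytes b0, b1, b2 yield two output shorts
+ //   s0 = b0 | ((b1 & 0xf) << 8)
+ //   s1 = (b1 >> 4) | (b2 << 4)
+ // with b0/b1/b2 drawn from vin[0]/vin[1]/vin[2] (or vin[3..5]).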
+
+ // Expand vin[0] into va[0:1], and vin[1] into va[2:3] and va[4:5]
+ // n.b. target elements 2 and 3 duplicate elements 4 and 5
+ __ ushll(va[0], __ T8H, vin[0], __ T8B, 0);
+ __ ushll2(va[1], __ T8H, vin[0], __ T16B, 0);
+ __ ushll(va[2], __ T8H, vin[1], __ T8B, 0);
+ __ ushll2(va[3], __ T8H, vin[1], __ T16B, 0);
+ __ ushll(va[4], __ T8H, vin[1], __ T8B, 0);
+ __ ushll2(va[5], __ T8H, vin[1], __ T16B, 0);
+
+ // likewise expand vin[3] into vb[0:1], and vin[4] into vb[2:3]
+ // and vb[4:5]
+ __ ushll(vb[0], __ T8H, vin[3], __ T8B, 0);
+ __ ushll2(vb[1], __ T8H, vin[3], __ T16B, 0);
+ __ ushll(vb[2], __ T8H, vin[4], __ T8B, 0);
+ __ ushll2(vb[3], __ T8H, vin[4], __ T16B, 0);
+ __ ushll(vb[4], __ T8H, vin[4], __ T8B, 0);
+ __ ushll2(vb[5], __ T8H, vin[4], __ T16B, 0);
+
+ // shift lo byte of copy 1 of the middle stripe into the high byte
+ __ shl(va[2], __ T8H, va[2], 8);
+ __ shl(va[3], __ T8H, va[3], 8);
+ __ shl(vb[2], __ T8H, vb[2], 8);
+ __ shl(vb[3], __ T8H, vb[3], 8);
+
+ // expand vin[2] into va[6:7] and vin[5] into vb[6:7] but this
+ // time pre-shifted by 4 to ensure top bits of input 12-bit int
+ // are in bit positions [4..11].
+ __ ushll(va[6], __ T8H, vin[2], __ T8B, 4);
+ __ ushll2(va[7], __ T8H, vin[2], __ T16B, 4);
+ __ ushll(vb[6], __ T8H, vin[5], __ T8B, 4);
+ __ ushll2(vb[7], __ T8H, vin[5], __ T16B, 4);
+
+ // mask hi 4 bits of the 1st 12-bit int in a pair from copy1 and
+ // shift lo 4 bits of the 2nd 12-bit int in a pair to the bottom of
+ // copy2
+ __ andr(va[2], __ T16B, va[2], v31);
+ __ andr(va[3], __ T16B, va[3], v31);
+ __ ushr(va[4], __ T8H, va[4], 4);
+ __ ushr(va[5], __ T8H, va[5], 4);
+ __ andr(vb[2], __ T16B, vb[2], v31);
+ __ andr(vb[3], __ T16B, vb[3], v31);
+ __ ushr(vb[4], __ T8H, vb[4], 4);
+ __ ushr(vb[5], __ T8H, vb[5], 4);
+
+ // sum hi 4 bits and lo 8 bits of the 1st 12-bit int in each pair and
+ // hi 8 bits plus lo 4 bits of the 2nd 12-bit int in each pair
+ // n.b. the ordering ensures: i) inputs are consumed before they
+ // are overwritten ii) the order of 16-bit results across successive
+ // pairs of vectors in va and then vb reflects the order of the
+ // corresponding 12-bit inputs
+ __ addv(va[0], __ T8H, va[0], va[2]);
+ __ addv(va[2], __ T8H, va[1], va[3]);
+ __ addv(va[1], __ T8H, va[4], va[6]);
+ __ addv(va[3], __ T8H, va[5], va[7]);
+ __ addv(vb[0], __ T8H, vb[0], vb[2]);
+ __ addv(vb[2], __ T8H, vb[1], vb[3]);
+ __ addv(vb[1], __ T8H, vb[4], vb[6]);
+ __ addv(vb[3], __ T8H, vb[5], vb[7]);
+
+ // store 64 results interleaved as shorts
+ vs_st2_post(vs_front(va), __ T8H, parsed);
+ vs_st2_post(vs_front(vb), __ T8H, parsed);
+
+ __ sub(parsedLength, parsedLength, 64);
+ __ cmp(parsedLength, (u1)64);
+ __ br(Assembler::GE, L_loop);
+ __ cbz(parsedLength, L_end);
+
+ // If anything is left it should be a final 72 bytes of input,
+ // i.e. a final 48 12-bit values. So we handle this by loading
+ // 48 bytes into all 16B lanes of front(vin) and only 24
+ // bytes into the lower 8B lanes of back(vin).
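+ // (this tail only arises for parsedLength == 112, since
+ // 112 = 64 + 48; for parsedLength == 256 the loop above
+ // consumes all of the input)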
+ vs_ld3_post(vs_front(vin), __ T16B, condensed);
+ vs_ld3(vs_back(vin), __ T8B, condensed);
+
+ // Expand vin[0] into va[0:1], and vin[1] into va[2:3] and va[4:5]
+ // n.b. target elements 2 and 3 of va duplicate elements 4 and
+ // 5 and target element 2 of vb duplicates element 4.
+ __ ushll(va[0], __ T8H, vin[0], __ T8B, 0);
+ __ ushll2(va[1], __ T8H, vin[0], __ T16B, 0);
+ __ ushll(va[2], __ T8H, vin[1], __ T8B, 0);
+ __ ushll2(va[3], __ T8H, vin[1], __ T16B, 0);
+ __ ushll(va[4], __ T8H, vin[1], __ T8B, 0);
+ __ ushll2(va[5], __ T8H, vin[1], __ T16B, 0);
+
+ // This time expand just the lower 8 lanes
+ __ ushll(vb[0], __ T8H, vin[3], __ T8B, 0);
+ __ ushll(vb[2], __ T8H, vin[4], __ T8B, 0);
+ __ ushll(vb[4], __ T8H, vin[4], __ T8B, 0);
+
+ // shift lo byte of copy 1 of the middle stripe into the high byte
+ __ shl(va[2], __ T8H, va[2], 8);
+ __ shl(va[3], __ T8H, va[3], 8);
+ __ shl(vb[2], __ T8H, vb[2], 8);
+
+ // expand vin[2] into va[6:7] and lower 8 lanes of vin[5] into
+ // vb[6] pre-shifted by 4 to ensure top bits of the input 12-bit
+ // int are in bit positions [4..11].
+ __ ushll(va[6], __ T8H, vin[2], __ T8B, 4);
+ __ ushll2(va[7], __ T8H, vin[2], __ T16B, 4);
+ __ ushll(vb[6], __ T8H, vin[5], __ T8B, 4);
+
+ // mask hi 4 bits of each 1st 12-bit int in pair from copy1 and
+ // shift lo 4 bits of each 2nd 12-bit int in pair to bottom of
+ // copy2
+ __ andr(va[2], __ T16B, va[2], v31);
+ __ andr(va[3], __ T16B, va[3], v31);
+ __ ushr(va[4], __ T8H, va[4], 4);
+ __ ushr(va[5], __ T8H, va[5], 4);
+ __ andr(vb[2], __ T16B, vb[2], v31);
+ __ ushr(vb[4], __ T8H, vb[4], 4);
+
+ // sum hi 4 bits and lo 8 bits of each 1st 12-bit int in pair and
+ // hi 8 bits plus lo 4 bits of each 2nd 12-bit int in pair
+ // n.b. ordering ensures: i) inputs are consumed before they are
+ // overwritten ii) order of 16-bit results across successive
+ // pairs of vectors in va and then lower half of vb reflects order
+ // of corresponding 12-bit inputs
+ __ addv(va[0], __ T8H, va[0], va[2]);
+ __ addv(va[2], __ T8H, va[1], va[3]);
+ __ addv(va[1], __ T8H, va[4], va[6]);
+ __ addv(va[3], __ T8H, va[5], va[7]);
+ __ addv(vb[0], __ T8H, vb[0], vb[2]);
+ __ addv(vb[1], __ T8H, vb[4], vb[6]);
+
+ // store 48 results interleaved as shorts
+ vs_st2_post(vs_front(va), __ T8H, parsed);
+ vs_st2_post(vs_front(vs_front(vb)), __ T8H, parsed);
+
+ __ BIND(L_end);
+
+ __ leave(); // required for proper stackwalking of RuntimeStub frame
+ __ mov(r0, zr); // return 0
+ __ ret(lr);
+
+ return start;
+ }
+
+ // Kyber Barrett reduce function.
+ // Implements
+ // static int implKyberBarrettReduce(short[] coeffs) {}
+ //
+ // coeffs (short[256]) = c_rarg0
+ address generate_kyberBarrettReduce() {
+
+ __ align(CodeEntryAlignment);
+ StubGenStubId stub_id = StubGenStubId::kyberBarrettReduce_id;
+ StubCodeMark mark(this, stub_id);
+ address start = __ pc();
+ __ enter();
+
+ const Register coeffs = c_rarg0;
+
+ const Register kyberConsts = r10;
+ const Register result = r11;
+
+ // As above we process a total of 256 coefficients i.e. 32 x
+ // 8H quadwords. So, we can load, reduce and store the data in 3
+ // groups of 11, 11 and 10 at a time i.e. we need to map sets
+ // of 10 or 11 registers. A further constraint is that the
+ // mapping needs to skip callee saves. So, we allocate the
+ // register sequences using two 8 sequences, two 2 sequences
+ // and two single registers.
+ VSeq<8> vs1_1(0);
+ VSeq<2> vs1_2(16);
+ FloatRegister vs1_3 = v28;
+ VSeq<8> vs2_1(18);
+ VSeq<2> vs2_2(26);
+ FloatRegister vs2_3 = v29;
+
+ // we also need a pair of corresponding constant sequences
+
+ VSeq<8> vc1_1(30, 0);
+ VSeq<2> vc1_2(30, 0);
+ FloatRegister vc1_3 = v30; // for kyber_q
+
+ VSeq<8> vc2_1(31, 0);
+ VSeq<2> vc2_2(31, 0);
+ FloatRegister vc2_3 = v31; // for kyberBarrettMultiplier
- void dilithium_montmul16_sub_add(const VSeq<4>& va0, const VSeq<4>& va1, const VSeq<4>& vc,
- const VSeq<4>& vtmp, const VSeq<2>& vq) {
+ __ add(result, coeffs, 0);
+ __ lea(kyberConsts,
+ ExternalAddress((address) StubRoutines::aarch64::_kyberConsts));
+
+ // load q and the multiplier for the Barrett reduction
+ __ add(kyberConsts, kyberConsts, 16);
+ __ ldpq(vc1_3, vc2_3, kyberConsts);
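+
+ // In scalar terms each 16-bit coefficient x is reduced below by a
+ // standard Barrett step (a sketch, not additional generated code)
+ //   t = (x * 20159) >> 26; // 20159 = 0x4EBF ~= 2^26 / 3329
+ //   x = x - t * 3329;      // 3329  = 0x0D01 = kyber_q
+ // realised via sqdmulh (which doubles) plus the extra shift by 11.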
+
+ for (int i = 0; i < 3; i++) {
+ // load 80 or 88 coefficients
+ vs_ldpq_post(vs1_1, coeffs);
+ vs_ldpq_post(vs1_2, coeffs);
+ if (i < 2) {
+ __ ldr(vs1_3, __ Q, __ post(coeffs, 16));
+ }
+
+ // vs2 <- (2 * vs1 * kyberBarrettMultiplier) >> 16
+ vs_sqdmulh(vs2_1, __ T8H, vs1_1, vc2_1);
+ vs_sqdmulh(vs2_2, __ T8H, vs1_2, vc2_2);
+ if (i < 2) {
+ __ sqdmulh(vs2_3, __ T8H, vs1_3, vc2_3);
+ }
+
+ // vs2 <- (vs1 * kyberBarrettMultiplier) >> 26
+ vs_sshr(vs2_1, __ T8H, vs2_1, 11);
+ vs_sshr(vs2_2, __ T8H, vs2_2, 11);
+ if (i < 2) {
+ __ sshr(vs2_3, __ T8H, vs2_3, 11);
+ }
+
+ // vs1 <- vs1 - vs2 * kyber_q
+ vs_mlsv(vs1_1, __ T8H, vs2_1, vc1_1);
+ vs_mlsv(vs1_2, __ T8H, vs2_2, vc1_2);
+ if (i < 2) {
+ __ mlsv(vs1_3, __ T8H, vs2_3, vc1_3);
+ }
+
+ vs_stpq_post(vs1_1, result);
+ vs_stpq_post(vs1_2, result);
+ if (i < 2) {
+ __ str(vs1_3, __ Q, __ post(result, 16));
+ }
+ }
+
+ __ leave(); // required for proper stackwalking of RuntimeStub frame
+ __ mov(r0, zr); // return 0
+ __ ret(lr);
+
+ return start;
+ }
+
+
+ // Dilithium-specific montmul helper routines that generate parallel
+ // code for, respectively, a single 4x4s vector sequence montmul or
+ // two such multiplies in a row.
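+ // In effect each 32-bit lane computes the Montgomery product
+ // a = (b * c * R^-1) mod q with R = 2^32 and q the Dilithium
+ // modulus (a sketch of the semantics provided by vs_montmul4,
+ // not additional generated code).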
+
+ // Perform 16 32-bit Montgomery multiplications in parallel
+ void dilithium_montmul16(const VSeq<4>& va, const VSeq<4>& vb, const VSeq<4>& vc,
+ const VSeq<4>& vtmp, const VSeq<2>& vq) {
+ // Use the helper routine to schedule a 4x4S Montgomery multiply.
+ // It will assert that the register use is valid
+ vs_montmul4(va, vb, vc, __ T4S, vtmp, vq);
+ }
+
+ // Perform 2x16 32-bit Montgomery multiplications in parallel
+ void dilithium_montmul32(const VSeq<8>& va, const VSeq<8>& vb, const VSeq<8>& vc,
+ const VSeq<4>& vtmp, const VSeq<2>& vq) {
+ // Schedule two successive 4x4S multiplies via the montmul helper
+ // on the front and back halves of va, vb and vc. The helper will
+ // assert that the register use has no overlap conflicts on each
+ // individual call but we also need to ensure that the necessary
+ // disjoint/equality constraints are met across both calls.
+
+ // vb, vc, vtmp and vq must be disjoint. va must either be
+ // disjoint from all other registers or equal vc
+
+ assert(vs_disjoint(vb, vc), "vb and vc overlap");
+ assert(vs_disjoint(vb, vq), "vb and vq overlap");
+ assert(vs_disjoint(vb, vtmp), "vb and vtmp overlap");
+
+ assert(vs_disjoint(vc, vq), "vc and vq overlap");
+ assert(vs_disjoint(vc, vtmp), "vc and vtmp overlap");
+
+ assert(vs_disjoint(vq, vtmp), "vq and vtmp overlap");
+
+ assert(vs_disjoint(va, vc) || vs_same(va, vc), "va and vc neither disjoint nor equal");
+ assert(vs_disjoint(va, vb), "va and vb overlap");
+ assert(vs_disjoint(va, vq), "va and vq overlap");
+ assert(vs_disjoint(va, vtmp), "va and vtmp overlap");
+
+ // We multiply the front and back halves of each sequence 4 at a
+ // time because
+ //
+ // 1) we are currently only able to get 4-way instruction
+ // parallelism at best
+ //
+ // 2) we need registers for the constants in vq and temporary
+ // scratch registers to hold intermediate results so vtmp can only
+ // be a VSeq<4> which means we only have 4 scratch slots.
+
+ vs_montmul4(vs_front(va), vs_front(vb), vs_front(vc), __ T4S, vtmp, vq);
+ vs_montmul4(vs_back(va), vs_back(vb), vs_back(vc), __ T4S, vtmp, vq);
+ }
+
+ // Perform combined montmul then add/sub on 4x4S vectors.
+ void dilithium_montmul16_sub_add(
+ const VSeq<4>& va0, const VSeq<4>& va1, const VSeq<4>& vc,
+ const VSeq<4>& vtmp, const VSeq<2>& vq) {
// compute a = montmul(a1, c)
dilithium_montmul16(vc, va1, vc, vtmp, vq);
      // output a1 = a0 - a
@@ -4917,10 +6271,10 @@ class StubGenerator: public StubCodeGenerator {
vs_addv(va0, __ T4S, va0, vc);
}
- // perform combined add/sub then montul on 4x4S vectors
-
- void dilithium_sub_add_montmul16(const VSeq<4>& va0, const VSeq<4>& va1, const VSeq<4>& vb,
- const VSeq<4>& vtmp1, const VSeq<4>& vtmp2, const VSeq<2>& vq) {
+ // Perform combined add/sub then montmul on 4x4S vectors.
+ void dilithium_sub_add_montmul16(
+ const VSeq<4>& va0, const VSeq<4>& va1, const VSeq<4>& vb,
+ const VSeq<4>& vtmp1, const VSeq<4>& vtmp2, const VSeq<2>& vq) {
// compute c = a0 - a1
vs_subv(vtmp1, __ T4S, va0, va1);
// output a0 = a0 + a1
@@ -4963,10 +6317,10 @@ class StubGenerator: public StubCodeGenerator {
offsets[3] = 192;
}
- // for levels 1 - 4 we simply load 2 x 4 adjacent values at a
+ // For levels 1 - 4 we simply load 2 x 4 adjacent values at a
// time at 4 different offsets and multiply them in order by the
// next set of input values. So we employ indexed load and store
- // pair instructions with arrangement 4S
+ // pair instructions with arrangement 4S.
for (int i = 0; i < 4; i++) {
// reload q and qinv
vs_ldpq(vq, dilithiumConsts); // qInv, q
@@ -4975,7 +6329,7 @@ class StubGenerator: public StubCodeGenerator {
// load next 8x4S inputs == b
vs_ldpq_post(vs2, zetas);
// compute a == c2 * b mod MONT_Q
- vs_montmul32(vs2, vs1, vs2, vtmp, vq);
+ dilithium_montmul32(vs2, vs1, vs2, vtmp, vq);
// load 8x4s coefficients via first start pos == c1
vs_ldpq_indexed(vs1, coeffs, c1Start, offsets);
// compute a1 = c1 + a
@@ -5029,20 +6383,21 @@ class StubGenerator: public StubCodeGenerator {
VSeq<8> vs1(0), vs2(16), vs3(24); // 3 sets of 8x4s inputs/outputs
VSeq<4> vtmp = vs_front(vs3); // n.b. tmp registers overlap vs3
VSeq<2> vq(30); // n.b. constants overlap vs3
- int offsets[4] = {0, 32, 64, 96};
- int offsets1[8] = {16, 48, 80, 112, 144, 176, 208, 240 };
+ int offsets[4] = { 0, 32, 64, 96};
+ int offsets1[8] = { 16, 48, 80, 112, 144, 176, 208, 240 };
int offsets2[8] = { 0, 32, 64, 96, 128, 160, 192, 224 };
__ add(result, coeffs, 0);
- __ lea(dilithiumConsts, ExternalAddress((address) StubRoutines::aarch64::_dilithiumConsts));
+ __ lea(dilithiumConsts,
+ ExternalAddress((address) StubRoutines::aarch64::_dilithiumConsts));
- // Each level represents one iteration of the outer for loop of the Java version
+ // Each level represents one iteration of the outer for loop of the Java version.
// level 0-4
dilithiumNttLevel0_4(dilithiumConsts, coeffs, zetas);
// level 5
- // at level 5 the coefficients we need to combine with the zetas
+ // At level 5 the coefficients we need to combine with the zetas
// are grouped in memory in blocks of size 4. So, for both sets of
// coefficients we load 4 adjacent values at 8 different offsets
// using an indexed ldr with register variant Q and multiply them
@@ -5056,7 +6411,7 @@ class StubGenerator: public StubCodeGenerator {
// load next 32 (8x4S) inputs = b
vs_ldpq_post(vs2, zetas);
      // a = b montmul c1
- vs_montmul32(vs2, vs1, vs2, vtmp, vq);
+ dilithium_montmul32(vs2, vs1, vs2, vtmp, vq);
// load 32 (8x4S) coefficients via second offsets = c2
vs_ldr_indexed(vs1, __ Q, coeffs, i, offsets2);
// add/sub with result of multiply
@@ -5068,7 +6423,7 @@ class StubGenerator: public StubCodeGenerator {
}
// level 6
- // at level 6 the coefficients we need to combine with the zetas
+ // At level 6 the coefficients we need to combine with the zetas
// are grouped in memory in pairs, the first two being montmul
// inputs and the second add/sub inputs. We can still implement
// the montmul+sub+add using 4-way parallelism but only if we
@@ -5096,7 +6451,7 @@ class StubGenerator: public StubCodeGenerator {
}
// level 7
- // at level 7 the coefficients we need to combine with the zetas
+ // At level 7 the coefficients we need to combine with the zetas
      // occur singly with montmul inputs alternating with add/sub
// inputs. Once again we can use 4-way parallelism to combine 16
// zetas at a time. However, we have to load 8 adjacent values at
@@ -5168,10 +6523,10 @@ class StubGenerator: public StubCodeGenerator {
offsets[3] = 96;
}
- // for levels 3 - 7 we simply load 2 x 4 adjacent values at a
+ // For levels 3 - 7 we simply load 2 x 4 adjacent values at a
// time at 4 different offsets and multiply them in order by the
// next set of input values. So we employ indexed load and store
- // pair instructions with arrangement 4S
+ // pair instructions with arrangement 4S.
for (int i = 0; i < 4; i++) {
// load v1 32 (8x4S) coefficients relative to first start index
vs_ldpq_indexed(vs1, coeffs, c1Start, offsets);
@@ -5188,7 +6543,7 @@ class StubGenerator: public StubCodeGenerator {
// load b next 32 (8x4S) inputs
vs_ldpq_post(vs2, zetas);
// a = a1 montmul b
- vs_montmul32(vs2, vs1, vs2, vtmp, vq);
+ dilithium_montmul32(vs2, vs1, vs2, vtmp, vq);
// save a relative to second start index
vs_stpq_indexed(vs2, coeffs, c2Start, offsets);
@@ -5239,16 +6594,16 @@ class StubGenerator: public StubCodeGenerator {
int offsets2[8] = { 16, 48, 80, 112, 144, 176, 208, 240 };
__ add(result, coeffs, 0);
- __ lea(dilithiumConsts, ExternalAddress((address) StubRoutines::aarch64::_dilithiumConsts));
+ __ lea(dilithiumConsts,
+ ExternalAddress((address) StubRoutines::aarch64::_dilithiumConsts));
// Each level represents one iteration of the outer for loop of the Java version
- // level0
// level 0
// At level 0 we need to interleave adjacent quartets of
// coefficients before we multiply and add/sub by the next 16
// zetas just as we did for level 7 in the multiply code. So we
- // load and store the values using an ld2/st2 with arrangement 4S
+ // load and store the values using an ld2/st2 with arrangement 4S.
for (int i = 0; i < 1024; i += 128) {
// load constants q, qinv
// n.b. this can be moved out of the loop as they do not get
@@ -5270,7 +6625,7 @@ class StubGenerator: public StubCodeGenerator {
// At level 1 we need to interleave pairs of adjacent pairs of
// coefficients before we multiply by the next 16 zetas just as we
// did for level 6 in the multiply code. So we load and store the
- // values an ld2/st2 with arrangement 2D
+ // values using an ld2/st2 with arrangement 2D.
for (int i = 0; i < 1024; i += 128) {
// a0/a1 load interleaved 32 (8x2D) coefficients
vs_ld2_indexed(vs1, __ T2D, coeffs, tmpAddr, i, offsets);
@@ -5306,7 +6661,7 @@ class StubGenerator: public StubCodeGenerator {
// reload constants q, qinv -- they were clobbered earlier
vs_ldpq(vq, dilithiumConsts); // qInv, q
// compute a1 = b montmul c
- vs_montmul32(vs2, vs1, vs2, vtmp, vq);
+ dilithium_montmul32(vs2, vs1, vs2, vtmp, vq);
// store a1 32 (8x4S) coefficients via second offsets
vs_str_indexed(vs2, __ Q, coeffs, i, offsets2);
}
@@ -5319,7 +6674,6 @@ class StubGenerator: public StubCodeGenerator {
__ ret(lr);
return start;
-
}
// Dilithium multiply polynomials in the NTT domain.
@@ -5353,7 +6707,8 @@ class StubGenerator: public StubCodeGenerator {
VSeq<2> vq(30); // n.b. constants overlap vs3
VSeq<8> vrsquare(29, 0); // for montmul by constant RSQUARE
- __ lea(dilithiumConsts, ExternalAddress((address) StubRoutines::aarch64::_dilithiumConsts));
+ __ lea(dilithiumConsts,
+ ExternalAddress((address) StubRoutines::aarch64::_dilithiumConsts));
// load constants q, qinv
vs_ldpq(vq, dilithiumConsts); // qInv, q
@@ -5370,9 +6725,9 @@ class StubGenerator: public StubCodeGenerator {
// c load 32 (8x4S) next inputs from poly2
vs_ldpq_post(vs2, poly2);
// compute a = b montmul c
- vs_montmul32(vs2, vs1, vs2, vtmp, vq);
+ dilithium_montmul32(vs2, vs1, vs2, vtmp, vq);
// compute a = rsquare montmul a
- vs_montmul32(vs2, vrsquare, vs2, vtmp, vq);
+ dilithium_montmul32(vs2, vrsquare, vs2, vtmp, vq);
// save a 32 (8x4S) results
vs_stpq_post(vs2, result);
@@ -5385,7 +6740,6 @@ class StubGenerator: public StubCodeGenerator {
__ ret(lr);
return start;
-
}
  // Dilithium Montgomery multiply an array by a constant.
@@ -5413,13 +6767,14 @@ class StubGenerator: public StubCodeGenerator {
const Register len = r12;
VSeq<8> vs1(0), vs2(16), vs3(24); // 3 sets of 8x4s inputs/outputs
- VSeq<4> vtmp = vs_front(vs3); // n.b. tmp registers overlap vs3
+ VSeq<4> vtmp = vs_front(vs3); // n.b. tmp registers overlap vs3
VSeq<2> vq(30); // n.b. constants overlap vs3
VSeq<8> vconst(29, 0); // for montmul by constant
// results track inputs
__ add(result, coeffs, 0);
- __ lea(dilithiumConsts, ExternalAddress((address) StubRoutines::aarch64::_dilithiumConsts));
+ __ lea(dilithiumConsts,
+ ExternalAddress((address) StubRoutines::aarch64::_dilithiumConsts));
// load constants q, qinv -- they do not get clobbered by first two loops
vs_ldpq(vq, dilithiumConsts); // qInv, q
@@ -5433,7 +6788,7 @@ class StubGenerator: public StubCodeGenerator {
// load next 32 inputs
vs_ldpq_post(vs2, coeffs);
// mont mul by constant
- vs_montmul32(vs2, vconst, vs2, vtmp, vq);
+ dilithium_montmul32(vs2, vconst, vs2, vtmp, vq);
// write next 32 results
vs_stpq_post(vs2, result);
@@ -5446,7 +6801,6 @@ class StubGenerator: public StubCodeGenerator {
__ ret(lr);
return start;
-
}
// Dilithium decompose poly.
@@ -5477,9 +6831,12 @@ class StubGenerator: public StubCodeGenerator {
const Register dilithiumConsts = r10;
const Register tmp = r11;
- VSeq<4> vs1(0), vs2(4), vs3(8); // 6 independent sets of 4x4s values
+ // 6 independent sets of 4x4s values
+ VSeq<4> vs1(0), vs2(4), vs3(8);
VSeq<4> vs4(12), vs5(16), vtmp(20);
- VSeq<4> one(25, 0); // 7 constants for cross-multiplying
+
+ // 7 constants for cross-multiplying
+ VSeq<4> one(25, 0);
VSeq<4> qminus1(26, 0);
VSeq<4> g2(27, 0);
VSeq<4> twog2(28, 0);
@@ -5489,7 +6846,8 @@ class StubGenerator: public StubCodeGenerator {
__ enter();
- __ lea(dilithiumConsts, ExternalAddress((address) StubRoutines::aarch64::_dilithiumConsts));
+ __ lea(dilithiumConsts,
+ ExternalAddress((address) StubRoutines::aarch64::_dilithiumConsts));
// save callee-saved registers
__ stpd(v8, v9, __ pre(sp, -64));
@@ -5586,7 +6944,6 @@ class StubGenerator: public StubCodeGenerator {
__ st4(vs3[0], vs3[1], vs3[2], vs3[3], __ T4S, __ post(lowPart, 64));
__ st4(vs1[0], vs1[1], vs1[2], vs1[3], __ T4S, __ post(highPart, 64));
-
__ sub(len, len, 64);
__ cmp(len, (u1)64);
__ br(Assembler::GE, L_loop);
@@ -5602,7 +6959,47 @@ class StubGenerator: public StubCodeGenerator {
__ ret(lr);
return start;
+ }
+ /**
+ * Arguments:
+ *
+ * Inputs:
+ * c_rarg0 - int crc
+ * c_rarg1 - byte* buf
+ * c_rarg2 - int length
+ *
+ * Output:
+ * r0 - int crc result
+ */
+ address generate_updateBytesCRC32() {
+ assert(UseCRC32Intrinsics, "what are we doing here?");
+
+ __ align(CodeEntryAlignment);
+ StubGenStubId stub_id = StubGenStubId::updateBytesCRC32_id;
+ StubCodeMark mark(this, stub_id);
+
+ address start = __ pc();
+
+ const Register crc = c_rarg0; // crc
+ const Register buf = c_rarg1; // source java byte array address
+ const Register len = c_rarg2; // length
+ const Register table0 = c_rarg3; // crc_table address
+ const Register table1 = c_rarg4;
+ const Register table2 = c_rarg5;
+ const Register table3 = c_rarg6;
+ const Register tmp3 = c_rarg7;
+
+ BLOCK_COMMENT("Entry:");
+ __ enter(); // required for proper stackwalking of RuntimeStub frame
+
+ __ kernel_crc32(crc, buf, len,
+ table0, table1, table2, table3, rscratch1, rscratch2, tmp3);
+
+ __ leave(); // required for proper stackwalking of RuntimeStub frame
+ __ ret(lr);
+
+ return start;
}
/**
@@ -9769,79 +11166,6 @@ class StubGenerator: public StubCodeGenerator {
// }
};
- void generate_vector_math_stubs() {
- // Get native vector math stub routine addresses
- void* libsleef = nullptr;
- char ebuf[1024];
- char dll_name[JVM_MAXPATHLEN];
- if (os::dll_locate_lib(dll_name, sizeof(dll_name), Arguments::get_dll_dir(), "sleef")) {
- libsleef = os::dll_load(dll_name, ebuf, sizeof ebuf);
- }
- if (libsleef == nullptr) {
- log_info(library)("Failed to load native vector math library, %s!", ebuf);
- return;
- }
- // Method naming convention
- // All the methods are named as <op><type><N>_<prec><backend>
- // Where:
- // <op> is the operation name, e.g. sin
- // <type> is optional to indicate float/double
- // "f/d" for vector float/double operation
- // <N> is the number of elements in the vector
- // "2/4" for neon, and "x" for sve
- // <prec> is the precision level
- // "u10/u05" represents 1.0/0.5 ULP error bounds
- // We use "u10" for all operations by default
- // But for those functions do not have u10 support, we use "u05" instead
- // <backend> indicates neon/sve
- // "sve/advsimd" for sve/neon implementations
- // e.g. sinfx_u10sve is the method for computing vector float sin using SVE instructions
- // cosd2_u10advsimd is the method for computing 2 elements vector double cos using NEON instructions
- //
- log_info(library)("Loaded library %s, handle " INTPTR_FORMAT, JNI_LIB_PREFIX "sleef" JNI_LIB_SUFFIX, p2i(libsleef));
-
- // Math vector stubs implemented with SVE for scalable vector size.
- if (UseSVE > 0) {
- for (int op = 0; op < VectorSupport::NUM_VECTOR_OP_MATH; op++) {
- int vop = VectorSupport::VECTOR_OP_MATH_START + op;
- // Skip "tanh" because there is performance regression
- if (vop == VectorSupport::VECTOR_OP_TANH) {
- continue;
- }
-
- // The native library does not support u10 level of "hypot".
- const char* ulf = (vop == VectorSupport::VECTOR_OP_HYPOT) ? "u05" : "u10";
-
- snprintf(ebuf, sizeof(ebuf), "%sfx_%ssve", VectorSupport::mathname[op], ulf);
- StubRoutines::_vector_f_math[VectorSupport::VEC_SIZE_SCALABLE][op] = (address)os::dll_lookup(libsleef, ebuf);
-
- snprintf(ebuf, sizeof(ebuf), "%sdx_%ssve", VectorSupport::mathname[op], ulf);
- StubRoutines::_vector_d_math[VectorSupport::VEC_SIZE_SCALABLE][op] = (address)os::dll_lookup(libsleef, ebuf);
- }
- }
-
- // Math vector stubs implemented with NEON for 64/128 bits vector size.
- for (int op = 0; op < VectorSupport::NUM_VECTOR_OP_MATH; op++) {
- int vop = VectorSupport::VECTOR_OP_MATH_START + op;
- // Skip "tanh" because there is performance regression
- if (vop == VectorSupport::VECTOR_OP_TANH) {
- continue;
- }
-
- // The native library does not support u10 level of "hypot".
- const char* ulf = (vop == VectorSupport::VECTOR_OP_HYPOT) ? "u05" : "u10";
-
- snprintf(ebuf, sizeof(ebuf), "%sf4_%sadvsimd", VectorSupport::mathname[op], ulf);
- StubRoutines::_vector_f_math[VectorSupport::VEC_SIZE_64][op] = (address)os::dll_lookup(libsleef, ebuf);
-
- snprintf(ebuf, sizeof(ebuf), "%sf4_%sadvsimd", VectorSupport::mathname[op], ulf);
- StubRoutines::_vector_f_math[VectorSupport::VEC_SIZE_128][op] = (address)os::dll_lookup(libsleef, ebuf);
-
- snprintf(ebuf, sizeof(ebuf), "%sd2_%sadvsimd", VectorSupport::mathname[op], ulf);
- StubRoutines::_vector_d_math[VectorSupport::VEC_SIZE_128][op] = (address)os::dll_lookup(libsleef, ebuf);
- }
- }
-
// Initialization
void generate_initial_stubs() {
// Generate initial stubs and initializes the entry points
@@ -9995,12 +11319,20 @@ class StubGenerator: public StubCodeGenerator {
StubRoutines::_montgomerySquare = g.generate_multiply();
}
- generate_vector_math_stubs();
-
#endif // COMPILER2
if (UseChaCha20Intrinsics) {
- StubRoutines::_chacha20Block = generate_chacha20Block_qrpar();
+ StubRoutines::_chacha20Block = generate_chacha20Block_blockpar();
+ }
+
+ if (UseKyberIntrinsics) {
+ StubRoutines::_kyberNtt = generate_kyberNtt();
+ StubRoutines::_kyberInverseNtt = generate_kyberInverseNtt();
+ StubRoutines::_kyberNttMult = generate_kyberNttMult();
+ StubRoutines::_kyberAddPoly_2 = generate_kyberAddPoly_2();
+ StubRoutines::_kyberAddPoly_3 = generate_kyberAddPoly_3();
+ StubRoutines::_kyber12To16 = generate_kyber12To16();
+ StubRoutines::_kyberBarrettReduce = generate_kyberBarrettReduce();
}
if (UseDilithiumIntrinsics) {
diff --git a/src/hotspot/cpu/aarch64/stubRoutines_aarch64.cpp b/src/hotspot/cpu/aarch64/stubRoutines_aarch64.cpp
index 536583ff40c0b..fab76c41303c7 100644
--- a/src/hotspot/cpu/aarch64/stubRoutines_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/stubRoutines_aarch64.cpp
@@ -48,6 +48,17 @@ STUBGEN_ARCH_ENTRIES_DO(DEFINE_ARCH_ENTRY, DEFINE_ARCH_ENTRY_INIT)
bool StubRoutines::aarch64::_completed = false;
+ATTRIBUTE_ALIGNED(64) uint16_t StubRoutines::aarch64::_kyberConsts[] =
+{
+ // Because we sometimes load these in pairs, montQInvModR, kyber_q
+ // and kyberBarrettMultiplier should stay together and in this order.
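+ // (for reference: 0xF301 = 62209 = kyber_q^-1 mod 2^16,
+ // 0x0D01 = 3329 = kyber_q, 0x4EBF = 20159 ~= 2^26 / kyber_q,
+ // 0x0549 = 1353 = 2^32 mod kyber_q)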
+ 0xF301, 0xF301, 0xF301, 0xF301, 0xF301, 0xF301, 0xF301, 0xF301, // montQInvModR
+ 0x0D01, 0x0D01, 0x0D01, 0x0D01, 0x0D01, 0x0D01, 0x0D01, 0x0D01, // kyber_q
+ 0x4EBF, 0x4EBF, 0x4EBF, 0x4EBF, 0x4EBF, 0x4EBF, 0x4EBF, 0x4EBF, // kyberBarrettMultiplier
+ 0x0200, 0x0200, 0x0200, 0x0200, 0x0200, 0x0200, 0x0200, 0x0200, // toMont((kyber_n / 2)^-1 (mod kyber_q))
+ 0x0549, 0x0549, 0x0549, 0x0549, 0x0549, 0x0549, 0x0549, 0x0549 // montRSquareModQ
+};
+
ATTRIBUTE_ALIGNED(64) uint32_t StubRoutines::aarch64::_dilithiumConsts[] =
{
58728449, 58728449, 58728449, 58728449, // montQInvModR
diff --git a/src/hotspot/cpu/aarch64/stubRoutines_aarch64.hpp b/src/hotspot/cpu/aarch64/stubRoutines_aarch64.hpp
index 857bb2ff10a91..4c942b9f8d81d 100644
--- a/src/hotspot/cpu/aarch64/stubRoutines_aarch64.hpp
+++ b/src/hotspot/cpu/aarch64/stubRoutines_aarch64.hpp
@@ -110,6 +110,7 @@ class aarch64 {
}
private:
+ static uint16_t _kyberConsts[];
static uint32_t _dilithiumConsts[];
static juint _crc_table[];
static jubyte _adler_table[];
diff --git a/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp b/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp
index 80c9437de6b7a..2db3b435abbc9 100644
--- a/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp
@@ -1978,11 +1978,11 @@ void TemplateInterpreterGenerator::generate_throw_exception() {
// preserve exception over this code sequence
__ pop_ptr(r0);
- __ str(r0, Address(rthread, JavaThread::vm_result_offset()));
+ __ str(r0, Address(rthread, JavaThread::vm_result_oop_offset()));
// remove the activation (without doing throws on illegalMonitorExceptions)
__ remove_activation(vtos, false, true, false);
// restore exception
- __ get_vm_result(r0, rthread);
+ __ get_vm_result_oop(r0, rthread);
// In between activations - previous activation type unknown yet
// compute continuation point - the continuation point expects the
diff --git a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
index e50810486c80d..2cc9b39983ac1 100644
--- a/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/templateTable_aarch64.cpp
@@ -484,7 +484,7 @@ void TemplateTable::condy_helper(Label& Done)
__ mov(rarg, (int) bytecode());
__ call_VM(obj, entry, rarg);
- __ get_vm_result_2(flags, rthread);
+ __ get_vm_result_metadata(flags, rthread);
// VMr = obj = base address to find primitive value to push
// VMr2 = flags = (tos, off) using format of CPCE::_flags
@@ -3723,8 +3723,7 @@ void TemplateTable::checkcast()
__ push(atos); // save receiver for result, and for GC
call_VM(r0, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
- // vm_result_2 has metadata result
- __ get_vm_result_2(r0, rthread);
+ __ get_vm_result_metadata(r0, rthread);
__ pop(r3); // restore receiver
__ b(resolved);
@@ -3777,8 +3776,7 @@ void TemplateTable::instanceof() {
__ push(atos); // save receiver for result, and for GC
call_VM(r0, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
- // vm_result_2 has metadata result
- __ get_vm_result_2(r0, rthread);
+ __ get_vm_result_metadata(r0, rthread);
__ pop(r3); // restore receiver
__ verify_oop(r3);
__ load_klass(r3, r3);
diff --git a/src/hotspot/cpu/aarch64/vmStructs_aarch64.hpp b/src/hotspot/cpu/aarch64/vmStructs_aarch64.hpp
index bf9c965213cef..2ec901f6a2ed9 100644
--- a/src/hotspot/cpu/aarch64/vmStructs_aarch64.hpp
+++ b/src/hotspot/cpu/aarch64/vmStructs_aarch64.hpp
@@ -35,8 +35,7 @@
static_field(VM_Version, _rop_protection, bool) \
static_field(VM_Version, _pac_mask, uintptr_t)
-#define VM_TYPES_CPU(declare_type, declare_toplevel_type, declare_oop_type, declare_integer_type, declare_unsigned_integer_type) \
- declare_toplevel_type(VM_Version)
+#define VM_TYPES_CPU(declare_type, declare_toplevel_type, declare_oop_type, declare_integer_type, declare_unsigned_integer_type)
#define VM_INT_CONSTANTS_CPU(declare_constant, declare_preprocessor_constant)
diff --git a/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp b/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp
index 0f04fee79220a..6ed7a6be58552 100644
--- a/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp
@@ -161,6 +161,9 @@ void VM_Version::initialize() {
(_model == CPU_MODEL_AMPERE_1A || _model == CPU_MODEL_AMPERE_1B)) {
FLAG_SET_DEFAULT(CodeEntryAlignment, 32);
}
+ if (FLAG_IS_DEFAULT(AlwaysMergeDMB)) {
+ FLAG_SET_DEFAULT(AlwaysMergeDMB, false);
+ }
}
// ThunderX
@@ -414,13 +417,24 @@ void VM_Version::initialize() {
FLAG_SET_DEFAULT(UseChaCha20Intrinsics, false);
}
+ if (_features & CPU_ASIMD) {
+ if (FLAG_IS_DEFAULT(UseKyberIntrinsics)) {
+ UseKyberIntrinsics = true;
+ }
+ } else if (UseKyberIntrinsics) {
+ if (!FLAG_IS_DEFAULT(UseKyberIntrinsics)) {
+ warning("Kyber intrinsics require ASIMD instructions");
+ }
+ FLAG_SET_DEFAULT(UseKyberIntrinsics, false);
+ }
+
if (_features & CPU_ASIMD) {
if (FLAG_IS_DEFAULT(UseDilithiumIntrinsics)) {
UseDilithiumIntrinsics = true;
}
} else if (UseDilithiumIntrinsics) {
if (!FLAG_IS_DEFAULT(UseDilithiumIntrinsics)) {
- warning("Dilithium intrinsic requires ASIMD instructions");
+ warning("Dilithium intrinsics require ASIMD instructions");
}
FLAG_SET_DEFAULT(UseDilithiumIntrinsics, false);
}
@@ -628,6 +642,7 @@ void VM_Version::initialize() {
if (_model2) {
os::snprintf_checked(buf + buf_used_len, sizeof(buf) - buf_used_len, "(0x%03x)", _model2);
}
+ size_t features_offset = strnlen(buf, sizeof(buf));
#define ADD_FEATURE_IF_SUPPORTED(id, name, bit) \
do { \
if (VM_Version::supports_##name()) strcat(buf, ", " #name); \
@@ -635,7 +650,11 @@ void VM_Version::initialize() {
CPU_FEATURE_FLAGS(ADD_FEATURE_IF_SUPPORTED)
#undef ADD_FEATURE_IF_SUPPORTED
- _features_string = os::strdup(buf);
+ _cpu_info_string = os::strdup(buf);
+
+ _features_string = extract_features_string(_cpu_info_string,
+ strnlen(_cpu_info_string, sizeof(buf)),
+ features_offset);
}
#if defined(LINUX)
@@ -702,7 +721,7 @@ void VM_Version::initialize_cpu_information(void) {
int desc_len = snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "AArch64 ");
get_compatible_board(_cpu_desc + desc_len, CPU_DETAILED_DESC_BUF_SIZE - desc_len);
desc_len = (int)strlen(_cpu_desc);
- snprintf(_cpu_desc + desc_len, CPU_DETAILED_DESC_BUF_SIZE - desc_len, " %s", _features_string);
+ snprintf(_cpu_desc + desc_len, CPU_DETAILED_DESC_BUF_SIZE - desc_len, " %s", _cpu_info_string);
_initialized = true;
}
diff --git a/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp b/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp
index 04cf9c9c2a07c..373f8da540589 100644
--- a/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp
+++ b/src/hotspot/cpu/aarch64/vm_version_aarch64.hpp
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 1997, 2024, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 1997, 2025, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2014, 2020, Red Hat Inc. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
@@ -125,6 +125,8 @@ enum Ampere_CPU_Model {
decl(SHA2, sha256, 6) \
decl(CRC32, crc32, 7) \
decl(LSE, lse, 8) \
+ decl(FPHP, fphp, 9) \
+ decl(ASIMDHP, asimdhp, 10) \
decl(DCPOP, dcpop, 16) \
decl(SHA3, sha3, 17) \
decl(SHA512, sha512, 21) \
diff --git a/src/hotspot/cpu/arm/arm.ad b/src/hotspot/cpu/arm/arm.ad
index 486d51ac46353..4a0b557968caa 100644
--- a/src/hotspot/cpu/arm/arm.ad
+++ b/src/hotspot/cpu/arm/arm.ad
@@ -1238,11 +1238,11 @@ encode %{
enc_class save_last_PC %{
// preserve mark
address mark = __ inst_mark();
- debug_only(int off0 = __ offset());
+ DEBUG_ONLY(int off0 = __ offset());
int ret_addr_offset = as_MachCall()->ret_addr_offset();
__ adr(LR, mark + ret_addr_offset);
__ str(LR, Address(Rthread, JavaThread::last_Java_pc_offset()));
- debug_only(int off1 = __ offset());
+ DEBUG_ONLY(int off1 = __ offset());
assert(off1 - off0 == 2 * Assembler::InstructionSize, "correct size prediction");
// restore mark
__ set_inst_mark(mark);
@@ -1251,11 +1251,11 @@ encode %{
enc_class preserve_SP %{
// preserve mark
address mark = __ inst_mark();
- debug_only(int off0 = __ offset());
+ DEBUG_ONLY(int off0 = __ offset());
// FP is preserved across all calls, even compiled calls.
// Use it to preserve SP in places where the callee might change the SP.
__ mov(Rmh_SP_save, SP);
- debug_only(int off1 = __ offset());
+ DEBUG_ONLY(int off1 = __ offset());
assert(off1 - off0 == 4, "correct size prediction");
// restore mark
__ set_inst_mark(mark);
@@ -8992,7 +8992,8 @@ instruct ShouldNotReachHere( )
format %{ "ShouldNotReachHere" %}
ins_encode %{
if (is_reachable()) {
- __ stop(_halt_reason);
+ const char* str = __ code_string(_halt_reason);
+ __ stop(str);
}
%}
ins_pipe(tail_call);
diff --git a/src/hotspot/cpu/arm/c1_CodeStubs_arm.cpp b/src/hotspot/cpu/arm/c1_CodeStubs_arm.cpp
index bca6c7ca30cb8..5683bc59d5c07 100644
--- a/src/hotspot/cpu/arm/c1_CodeStubs_arm.cpp
+++ b/src/hotspot/cpu/arm/c1_CodeStubs_arm.cpp
@@ -59,7 +59,7 @@ void RangeCheckStub::emit_code(LIR_Assembler* ce) {
__ call(Runtime1::entry_for(C1StubId::predicate_failed_trap_id), relocInfo::runtime_call_type);
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
return;
}
// Pass the array index on stack because all registers must be preserved
@@ -91,7 +91,7 @@ void PredicateFailedStub::emit_code(LIR_Assembler* ce) {
__ call(Runtime1::entry_for(C1StubId::predicate_failed_trap_id), relocInfo::runtime_call_type);
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
void DivByZeroStub::emit_code(LIR_Assembler* ce) {
diff --git a/src/hotspot/cpu/arm/c1_Runtime1_arm.cpp b/src/hotspot/cpu/arm/c1_Runtime1_arm.cpp
index 949e985ab1eea..021b47148fa8c 100644
--- a/src/hotspot/cpu/arm/c1_Runtime1_arm.cpp
+++ b/src/hotspot/cpu/arm/c1_Runtime1_arm.cpp
@@ -70,11 +70,11 @@ int StubAssembler::call_RT(Register oop_result1, Register metadata_result, addre
if (oop_result1->is_valid()) {
assert_different_registers(oop_result1, R3, Rtemp);
- get_vm_result(oop_result1, Rtemp);
+ get_vm_result_oop(oop_result1, Rtemp);
}
if (metadata_result->is_valid()) {
assert_different_registers(metadata_result, R3, Rtemp);
- get_vm_result_2(metadata_result, Rtemp);
+ get_vm_result_metadata(metadata_result, Rtemp);
}
// Check for pending exception
diff --git a/src/hotspot/cpu/arm/gc/g1/g1BarrierSetAssembler_arm.cpp b/src/hotspot/cpu/arm/gc/g1/g1BarrierSetAssembler_arm.cpp
index 466dcc8fe66c1..049477cda7658 100644
--- a/src/hotspot/cpu/arm/gc/g1/g1BarrierSetAssembler_arm.cpp
+++ b/src/hotspot/cpu/arm/gc/g1/g1BarrierSetAssembler_arm.cpp
@@ -26,7 +26,6 @@
#include "gc/g1/g1BarrierSet.hpp"
#include "gc/g1/g1BarrierSetAssembler.hpp"
#include "gc/g1/g1BarrierSetRuntime.hpp"
-#include "gc/g1/g1ThreadLocalData.hpp"
#include "gc/g1/g1CardTable.hpp"
#include "gc/g1/g1HeapRegion.hpp"
#include "gc/g1/g1ThreadLocalData.hpp"
diff --git a/src/hotspot/cpu/arm/gc/shared/barrierSetNMethod_arm.cpp b/src/hotspot/cpu/arm/gc/shared/barrierSetNMethod_arm.cpp
index 224a499ff5420..52d71ca65c29d 100644
--- a/src/hotspot/cpu/arm/gc/shared/barrierSetNMethod_arm.cpp
+++ b/src/hotspot/cpu/arm/gc/shared/barrierSetNMethod_arm.cpp
@@ -29,8 +29,8 @@
#include "memory/resourceArea.hpp"
#include "runtime/frame.inline.hpp"
#include "runtime/javaThread.hpp"
-#include "runtime/sharedRuntime.hpp"
#include "runtime/registerMap.hpp"
+#include "runtime/sharedRuntime.hpp"
#include "utilities/align.hpp"
#include "utilities/debug.hpp"
@@ -72,7 +72,7 @@ void NativeNMethodBarrier::verify() const {
static NativeNMethodBarrier* native_nmethod_barrier(nmethod* nm) {
address barrier_address = nm->code_begin() + nm->frame_complete_offset() - entry_barrier_bytes;
  NativeNMethodBarrier* barrier = reinterpret_cast<NativeNMethodBarrier*>(barrier_address);
- debug_only(barrier->verify());
+ DEBUG_ONLY(barrier->verify());
return barrier;
}
diff --git a/src/hotspot/cpu/arm/macroAssembler_arm.cpp b/src/hotspot/cpu/arm/macroAssembler_arm.cpp
index 638b3a5404c25..3dcde7d898d08 100644
--- a/src/hotspot/cpu/arm/macroAssembler_arm.cpp
+++ b/src/hotspot/cpu/arm/macroAssembler_arm.cpp
@@ -424,7 +424,7 @@ void MacroAssembler::call_VM_helper(Register oop_result, address entry_point, in
// get oop result if there is one and reset the value in the thread
if (oop_result->is_valid()) {
- get_vm_result(oop_result, tmp);
+ get_vm_result_oop(oop_result, tmp);
}
}
@@ -528,17 +528,17 @@ void MacroAssembler::call_VM_leaf(address entry_point, Register arg_1, Register
call_VM_leaf_helper(entry_point, 4);
}
-void MacroAssembler::get_vm_result(Register oop_result, Register tmp) {
+void MacroAssembler::get_vm_result_oop(Register oop_result, Register tmp) {
assert_different_registers(oop_result, tmp);
- ldr(oop_result, Address(Rthread, JavaThread::vm_result_offset()));
- str(zero_register(tmp), Address(Rthread, JavaThread::vm_result_offset()));
+ ldr(oop_result, Address(Rthread, JavaThread::vm_result_oop_offset()));
+ str(zero_register(tmp), Address(Rthread, JavaThread::vm_result_oop_offset()));
verify_oop(oop_result);
}
-void MacroAssembler::get_vm_result_2(Register metadata_result, Register tmp) {
+void MacroAssembler::get_vm_result_metadata(Register metadata_result, Register tmp) {
assert_different_registers(metadata_result, tmp);
- ldr(metadata_result, Address(Rthread, JavaThread::vm_result_2_offset()));
- str(zero_register(tmp), Address(Rthread, JavaThread::vm_result_2_offset()));
+ ldr(metadata_result, Address(Rthread, JavaThread::vm_result_metadata_offset()));
+ str(zero_register(tmp), Address(Rthread, JavaThread::vm_result_metadata_offset()));
}
void MacroAssembler::add_rc(Register dst, Register arg1, RegisterOrConstant arg2) {
diff --git a/src/hotspot/cpu/arm/macroAssembler_arm.hpp b/src/hotspot/cpu/arm/macroAssembler_arm.hpp
index 621f0101432e7..d60b38e42dbea 100644
--- a/src/hotspot/cpu/arm/macroAssembler_arm.hpp
+++ b/src/hotspot/cpu/arm/macroAssembler_arm.hpp
@@ -257,8 +257,8 @@ class MacroAssembler: public Assembler {
void call_VM_leaf(address entry_point, Register arg_1, Register arg_2, Register arg_3);
void call_VM_leaf(address entry_point, Register arg_1, Register arg_2, Register arg_3, Register arg_4);
- void get_vm_result(Register oop_result, Register tmp);
- void get_vm_result_2(Register metadata_result, Register tmp);
+ void get_vm_result_oop(Register oop_result, Register tmp);
+ void get_vm_result_metadata(Register metadata_result, Register tmp);
// Always sets/resets sp, which default to SP if (last_sp == noreg)
// Optionally sets/resets fp (use noreg to avoid setting it)
diff --git a/src/hotspot/cpu/arm/runtime_arm.cpp b/src/hotspot/cpu/arm/runtime_arm.cpp
index 20c1bc199d3e7..615a63eac19af 100644
--- a/src/hotspot/cpu/arm/runtime_arm.cpp
+++ b/src/hotspot/cpu/arm/runtime_arm.cpp
@@ -54,6 +54,9 @@ UncommonTrapBlob* OptoRuntime::generate_uncommon_trap_blob() {
// Measured 8/7/03 at 660 in 32bit debug build
CodeBuffer buffer(name, 2000, 512);
#endif
+ if (buffer.blob() == nullptr) {
+ return nullptr;
+ }
// bypassed when code generation useless
MacroAssembler* masm = new MacroAssembler(&buffer);
const Register Rublock = R6;
@@ -209,6 +212,9 @@ ExceptionBlob* OptoRuntime::generate_exception_blob() {
// Measured 8/7/03 at 256 in 32bit debug build
const char* name = OptoRuntime::stub_name(OptoStubId::exception_id);
CodeBuffer buffer(name, 600, 512);
+ if (buffer.blob() == nullptr) {
+ return nullptr;
+ }
MacroAssembler* masm = new MacroAssembler(&buffer);
int framesize_in_words = 2; // FP + LR
diff --git a/src/hotspot/cpu/arm/sharedRuntime_arm.cpp b/src/hotspot/cpu/arm/sharedRuntime_arm.cpp
index c63d72920a5b6..8ba847e7e3288 100644
--- a/src/hotspot/cpu/arm/sharedRuntime_arm.cpp
+++ b/src/hotspot/cpu/arm/sharedRuntime_arm.cpp
@@ -612,12 +612,12 @@ static void gen_c2i_adapter(MacroAssembler *masm,
}
-AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm,
- int total_args_passed,
- int comp_args_on_stack,
- const BasicType *sig_bt,
- const VMRegPair *regs,
- AdapterFingerPrint* fingerprint) {
+void SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm,
+ int total_args_passed,
+ int comp_args_on_stack,
+ const BasicType *sig_bt,
+ const VMRegPair *regs,
+ AdapterHandlerEntry* handler) {
address i2c_entry = __ pc();
gen_i2c_adapter(masm, total_args_passed, comp_args_on_stack, sig_bt, regs);
@@ -637,7 +637,8 @@ AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm
address c2i_entry = __ pc();
gen_c2i_adapter(masm, total_args_passed, comp_args_on_stack, sig_bt, regs, skip_fixup);
- return AdapterHandlerLibrary::new_entry(fingerprint, i2c_entry, c2i_entry, c2i_unverified_entry);
+ handler->set_entry_points(i2c_entry, c2i_entry, c2i_unverified_entry, nullptr);
+ return;
}
@@ -1717,7 +1718,7 @@ RuntimeStub* SharedRuntime::generate_resolve_blob(SharedStubId id, address desti
// Overwrite saved register values
// Place metadata result of VM call into Rmethod
- __ get_vm_result_2(R1, Rtemp);
+ __ get_vm_result_metadata(R1, Rtemp);
__ str(R1, Address(SP, RegisterSaver::Rmethod_offset * wordSize));
// Place target address (VM call result) into Rtemp
@@ -1730,7 +1731,7 @@ RuntimeStub* SharedRuntime::generate_resolve_blob(SharedStubId id, address desti
RegisterSaver::restore_live_registers(masm);
const Register Rzero = __ zero_register(Rtemp);
- __ str(Rzero, Address(Rthread, JavaThread::vm_result_2_offset()));
+ __ str(Rzero, Address(Rthread, JavaThread::vm_result_metadata_offset()));
__ mov(Rexception_pc, LR);
__ jump(StubRoutines::forward_exception_entry(), relocInfo::runtime_call_type, Rtemp);
diff --git a/src/hotspot/cpu/arm/templateInterpreterGenerator_arm.cpp b/src/hotspot/cpu/arm/templateInterpreterGenerator_arm.cpp
index da226c09f3cd6..30d88a4db91fd 100644
--- a/src/hotspot/cpu/arm/templateInterpreterGenerator_arm.cpp
+++ b/src/hotspot/cpu/arm/templateInterpreterGenerator_arm.cpp
@@ -1467,11 +1467,11 @@ void TemplateInterpreterGenerator::generate_throw_exception() {
// preserve exception over this code sequence
__ pop_ptr(R0_tos);
- __ str(R0_tos, Address(Rthread, JavaThread::vm_result_offset()));
+ __ str(R0_tos, Address(Rthread, JavaThread::vm_result_oop_offset()));
// remove the activation (without doing throws on illegalMonitorExceptions)
__ remove_activation(vtos, Rexception_pc, false, true, false);
// restore exception
- __ get_vm_result(Rexception_obj, Rtemp);
+ __ get_vm_result_oop(Rexception_obj, Rtemp);
// In between activations - previous activation type unknown yet
// compute continuation point - the continuation point expects
diff --git a/src/hotspot/cpu/arm/templateTable_arm.cpp b/src/hotspot/cpu/arm/templateTable_arm.cpp
index bbe5713090af5..50e3761dcb917 100644
--- a/src/hotspot/cpu/arm/templateTable_arm.cpp
+++ b/src/hotspot/cpu/arm/templateTable_arm.cpp
@@ -538,7 +538,7 @@ void TemplateTable::condy_helper(Label& Done)
__ mov(rtmp, (int) bytecode());
__ call_VM(obj, CAST_FROM_FN_PTR(address, InterpreterRuntime::resolve_ldc), rtmp);
- __ get_vm_result_2(flags, rtmp);
+ __ get_vm_result_metadata(flags, rtmp);
// VMr = obj = base address to find primitive value to push
// VMr2 = flags = (tos, off) using format of CPCE::_flags
@@ -4143,8 +4143,7 @@ void TemplateTable::checkcast() {
__ push(atos);
call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
- // vm_result_2 has metadata result
- __ get_vm_result_2(Rsuper, Robj);
+ __ get_vm_result_metadata(Rsuper, Robj);
__ pop_ptr(Robj);
__ b(resolved);
@@ -4214,8 +4213,7 @@ void TemplateTable::instanceof() {
__ push(atos);
call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
- // vm_result_2 has metadata result
- __ get_vm_result_2(Rsuper, Robj);
+ __ get_vm_result_metadata(Rsuper, Robj);
__ pop_ptr(Robj);
__ b(resolved);
diff --git a/src/hotspot/cpu/arm/vm_version_arm_32.cpp b/src/hotspot/cpu/arm/vm_version_arm_32.cpp
index 148786a55da41..d094193603567 100644
--- a/src/hotspot/cpu/arm/vm_version_arm_32.cpp
+++ b/src/hotspot/cpu/arm/vm_version_arm_32.cpp
@@ -295,7 +295,7 @@ void VM_Version::initialize() {
(has_multiprocessing_extensions() ? ", mp_ext" : ""));
// buf is started with ", " or is empty
- _features_string = os::strdup(buf);
+ _cpu_info_string = os::strdup(buf);
if (has_simd()) {
if (FLAG_IS_DEFAULT(UsePopCountInstruction)) {
@@ -363,6 +363,6 @@ void VM_Version::initialize_cpu_information(void) {
_no_of_threads = _no_of_cores;
_no_of_sockets = _no_of_cores;
snprintf(_cpu_name, CPU_TYPE_DESC_BUF_SIZE - 1, "ARM%d", _arm_arch);
- snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "%s", _features_string);
+ snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "%s", _cpu_info_string);
_initialized = true;
}
diff --git a/src/hotspot/cpu/ppc/c1_CodeStubs_ppc.cpp b/src/hotspot/cpu/ppc/c1_CodeStubs_ppc.cpp
index d4f5faa29a869..a390a6eeed410 100644
--- a/src/hotspot/cpu/ppc/c1_CodeStubs_ppc.cpp
+++ b/src/hotspot/cpu/ppc/c1_CodeStubs_ppc.cpp
@@ -74,7 +74,7 @@ void RangeCheckStub::emit_code(LIR_Assembler* ce) {
__ bctrl();
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ illtrap());
+ DEBUG_ONLY(__ illtrap());
return;
}
@@ -98,7 +98,7 @@ void RangeCheckStub::emit_code(LIR_Assembler* ce) {
__ bctrl();
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ illtrap());
+ DEBUG_ONLY(__ illtrap());
}
@@ -115,7 +115,7 @@ void PredicateFailedStub::emit_code(LIR_Assembler* ce) {
__ bctrl();
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ illtrap());
+ DEBUG_ONLY(__ illtrap());
}
@@ -156,7 +156,7 @@ void DivByZeroStub::emit_code(LIR_Assembler* ce) {
__ bctrl();
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ illtrap());
+ DEBUG_ONLY(__ illtrap());
}
@@ -179,7 +179,7 @@ void ImplicitNullCheckStub::emit_code(LIR_Assembler* ce) {
__ bctrl();
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ illtrap());
+ DEBUG_ONLY(__ illtrap());
}
@@ -193,7 +193,7 @@ void SimpleExceptionStub::emit_code(LIR_Assembler* ce) {
__ mtctr(R0);
__ bctrl();
ce->add_call_info_here(_info);
- debug_only( __ illtrap(); )
+ DEBUG_ONLY( __ illtrap(); )
}
@@ -441,7 +441,7 @@ void DeoptimizeStub::emit_code(LIR_Assembler* ce) {
__ load_const_optimized(R0, _trap_request); // Pass trap request in R0.
__ bctrl();
ce->add_call_info_here(_info);
- debug_only(__ illtrap());
+ DEBUG_ONLY(__ illtrap());
}
diff --git a/src/hotspot/cpu/ppc/c1_FrameMap_ppc.cpp b/src/hotspot/cpu/ppc/c1_FrameMap_ppc.cpp
index e4684613e2589..8ce324a570bd9 100644
--- a/src/hotspot/cpu/ppc/c1_FrameMap_ppc.cpp
+++ b/src/hotspot/cpu/ppc/c1_FrameMap_ppc.cpp
@@ -189,7 +189,7 @@ LIR_Opr FrameMap::_caller_save_fpu_regs[] = {};
FloatRegister FrameMap::nr2floatreg (int rnr) {
assert(_init_done, "tables not initialized");
- debug_only(fpu_range_check(rnr);)
+ DEBUG_ONLY(fpu_range_check(rnr);)
return _fpu_regs[rnr];
}
diff --git a/src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp b/src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp
index ac9c5984de050..77d3653aefdb8 100644
--- a/src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp
+++ b/src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp
@@ -83,16 +83,17 @@ void C1_MacroAssembler::lock_object(Register Rmark, Register Roop, Register Rbox
// Save object being locked into the BasicObjectLock...
std(Roop, in_bytes(BasicObjectLock::obj_offset()), Rbox);
- if (DiagnoseSyncOnValueBasedClasses != 0) {
- load_klass(Rscratch, Roop);
- lbz(Rscratch, in_bytes(Klass::misc_flags_offset()), Rscratch);
- testbitdi(CR0, R0, Rscratch, exact_log2(KlassFlags::_misc_is_value_based_class));
- bne(CR0, slow_int);
- }
-
if (LockingMode == LM_LIGHTWEIGHT) {
lightweight_lock(Rbox, Roop, Rmark, Rscratch, slow_int);
} else if (LockingMode == LM_LEGACY) {
+
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(Rscratch, Roop);
+ lbz(Rscratch, in_bytes(Klass::misc_flags_offset()), Rscratch);
+ testbitdi(CR0, R0, Rscratch, exact_log2(KlassFlags::_misc_is_value_based_class));
+ bne(CR0, slow_int);
+ }
+
// ... and mark it unlocked.
ori(Rmark, Rmark, markWord::unlocked_value);
diff --git a/src/hotspot/cpu/ppc/c1_Runtime1_ppc.cpp b/src/hotspot/cpu/ppc/c1_Runtime1_ppc.cpp
index 11c01dcdc60e6..79b129c08ae22 100644
--- a/src/hotspot/cpu/ppc/c1_Runtime1_ppc.cpp
+++ b/src/hotspot/cpu/ppc/c1_Runtime1_ppc.cpp
@@ -82,10 +82,10 @@ int StubAssembler::call_RT(Register oop_result1, Register metadata_result,
if (oop_result1->is_valid() || metadata_result->is_valid()) {
li(R0, 0);
if (oop_result1->is_valid()) {
- std(R0, in_bytes(JavaThread::vm_result_offset()), R16_thread);
+ std(R0, in_bytes(JavaThread::vm_result_oop_offset()), R16_thread);
}
if (metadata_result->is_valid()) {
- std(R0, in_bytes(JavaThread::vm_result_2_offset()), R16_thread);
+ std(R0, in_bytes(JavaThread::vm_result_metadata_offset()), R16_thread);
}
}
@@ -112,10 +112,10 @@ int StubAssembler::call_RT(Register oop_result1, Register metadata_result,
// Get oop results if there are any and reset the values in the thread.
if (oop_result1->is_valid()) {
- get_vm_result(oop_result1);
+ get_vm_result_oop(oop_result1);
}
if (metadata_result->is_valid()) {
- get_vm_result_2(metadata_result);
+ get_vm_result_metadata(metadata_result);
}
return (int)(return_pc - code_section()->start());
diff --git a/src/hotspot/cpu/ppc/gc/shared/barrierSetNMethod_ppc.cpp b/src/hotspot/cpu/ppc/gc/shared/barrierSetNMethod_ppc.cpp
index 19084ed27c7c0..d3bb9cc3c04da 100644
--- a/src/hotspot/cpu/ppc/gc/shared/barrierSetNMethod_ppc.cpp
+++ b/src/hotspot/cpu/ppc/gc/shared/barrierSetNMethod_ppc.cpp
@@ -23,8 +23,8 @@
*/
#include "code/codeBlob.hpp"
-#include "code/nmethod.hpp"
#include "code/nativeInst.hpp"
+#include "code/nmethod.hpp"
#include "gc/shared/barrierSet.hpp"
#include "gc/shared/barrierSetAssembler.hpp"
#include "gc/shared/barrierSetNMethod.hpp"
@@ -108,7 +108,7 @@ static NativeNMethodBarrier* get_nmethod_barrier(nmethod* nm) {
}
  auto barrier = reinterpret_cast<NativeNMethodBarrier*>(barrier_address);
- debug_only(barrier->verify());
+ DEBUG_ONLY(barrier->verify());
return barrier;
}
diff --git a/src/hotspot/cpu/ppc/gc/shenandoah/c1/shenandoahBarrierSetC1_ppc.cpp b/src/hotspot/cpu/ppc/gc/shenandoah/c1/shenandoahBarrierSetC1_ppc.cpp
index 48422bc66212e..5b24259103f53 100644
--- a/src/hotspot/cpu/ppc/gc/shenandoah/c1/shenandoahBarrierSetC1_ppc.cpp
+++ b/src/hotspot/cpu/ppc/gc/shenandoah/c1/shenandoahBarrierSetC1_ppc.cpp
@@ -26,9 +26,9 @@
#include "asm/macroAssembler.inline.hpp"
#include "c1/c1_LIRAssembler.hpp"
#include "c1/c1_MacroAssembler.hpp"
+#include "gc/shenandoah/c1/shenandoahBarrierSetC1.hpp"
#include "gc/shenandoah/shenandoahBarrierSet.hpp"
#include "gc/shenandoah/shenandoahBarrierSetAssembler.hpp"
-#include "gc/shenandoah/c1/shenandoahBarrierSetC1.hpp"
#define __ masm->masm()->
diff --git a/src/hotspot/cpu/ppc/gc/shenandoah/shenandoahBarrierSetAssembler_ppc.cpp b/src/hotspot/cpu/ppc/gc/shenandoah/shenandoahBarrierSetAssembler_ppc.cpp
index 842201e158489..ec5b98bd4c516 100644
--- a/src/hotspot/cpu/ppc/gc/shenandoah/shenandoahBarrierSetAssembler_ppc.cpp
+++ b/src/hotspot/cpu/ppc/gc/shenandoah/shenandoahBarrierSetAssembler_ppc.cpp
@@ -24,8 +24,10 @@
*/
#include "asm/macroAssembler.inline.hpp"
-#include "gc/shared/gcArguments.hpp"
#include "gc/shared/gc_globals.hpp"
+#include "gc/shared/gcArguments.hpp"
+#include "gc/shenandoah/heuristics/shenandoahHeuristics.hpp"
+#include "gc/shenandoah/mode/shenandoahMode.hpp"
#include "gc/shenandoah/shenandoahBarrierSet.hpp"
#include "gc/shenandoah/shenandoahBarrierSetAssembler.hpp"
#include "gc/shenandoah/shenandoahForwarding.hpp"
@@ -34,8 +36,6 @@
#include "gc/shenandoah/shenandoahHeapRegion.hpp"
#include "gc/shenandoah/shenandoahRuntime.hpp"
#include "gc/shenandoah/shenandoahThreadLocalData.hpp"
-#include "gc/shenandoah/heuristics/shenandoahHeuristics.hpp"
-#include "gc/shenandoah/mode/shenandoahMode.hpp"
#include "interpreter/interpreter.hpp"
#include "macroAssembler_ppc.hpp"
#include "runtime/javaThread.hpp"
diff --git a/src/hotspot/cpu/ppc/gc/z/zAddress_ppc.cpp b/src/hotspot/cpu/ppc/gc/z/zAddress_ppc.cpp
index 28a57b2dc293f..f3a7a948f7021 100644
--- a/src/hotspot/cpu/ppc/gc/z/zAddress_ppc.cpp
+++ b/src/hotspot/cpu/ppc/gc/z/zAddress_ppc.cpp
@@ -21,8 +21,8 @@
* questions.
*/
-#include "gc/shared/gcLogPrecious.hpp"
#include "gc/shared/gc_globals.hpp"
+#include "gc/shared/gcLogPrecious.hpp"
#include "gc/z/zAddress.inline.hpp"
#include "gc/z/zGlobals.hpp"
#include "runtime/globals.hpp"
@@ -92,7 +92,7 @@ size_t ZPlatformAddressOffsetBits() {
static const size_t valid_max_address_offset_bits = probe_valid_max_address_bit() + 1;
const size_t max_address_offset_bits = valid_max_address_offset_bits - 3;
const size_t min_address_offset_bits = max_address_offset_bits - 2;
- const size_t address_offset = round_up_power_of_2(MaxHeapSize * ZVirtualToPhysicalRatio);
+ const size_t address_offset = ZGlobalsPointers::min_address_offset_request();
const size_t address_offset_bits = log2i_exact(address_offset);
return clamp(address_offset_bits, min_address_offset_bits, max_address_offset_bits);
}
diff --git a/src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp b/src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp
index 7a75dfd3de18b..b51a0739d63ed 100644
--- a/src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp
+++ b/src/hotspot/cpu/ppc/interp_masm_ppc_64.cpp
@@ -958,17 +958,18 @@ void InterpreterMacroAssembler::lock_object(Register monitor, Register object) {
// markWord displaced_header = obj->mark().set_unlocked();
- if (DiagnoseSyncOnValueBasedClasses != 0) {
- load_klass(tmp, object);
- lbz(tmp, in_bytes(Klass::misc_flags_offset()), tmp);
- testbitdi(CR0, R0, tmp, exact_log2(KlassFlags::_misc_is_value_based_class));
- bne(CR0, slow_case);
- }
-
if (LockingMode == LM_LIGHTWEIGHT) {
lightweight_lock(monitor, object, header, tmp, slow_case);
b(done);
} else if (LockingMode == LM_LEGACY) {
+
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(tmp, object);
+ lbz(tmp, in_bytes(Klass::misc_flags_offset()), tmp);
+ testbitdi(CR0, R0, tmp, exact_log2(KlassFlags::_misc_is_value_based_class));
+ bne(CR0, slow_case);
+ }
+
// Load markWord from object into header.
ld(header, oopDesc::mark_offset_in_bytes(), object);
diff --git a/src/hotspot/cpu/ppc/macroAssembler_ppc.cpp b/src/hotspot/cpu/ppc/macroAssembler_ppc.cpp
index 1786b13d33291..ca0a1344d143c 100644
--- a/src/hotspot/cpu/ppc/macroAssembler_ppc.cpp
+++ b/src/hotspot/cpu/ppc/macroAssembler_ppc.cpp
@@ -1284,13 +1284,7 @@ int MacroAssembler::ic_check(int end_alignment) {
if (use_trap_based_null_check) {
trap_null_check(receiver);
}
- if (UseCompactObjectHeaders) {
- load_narrow_klass_compact(tmp1, receiver);
- } else if (UseCompressedClassPointers) {
- lwz(tmp1, oopDesc::klass_offset_in_bytes(), receiver);
- } else {
- ld(tmp1, oopDesc::klass_offset_in_bytes(), receiver);
- }
+ load_klass_no_decode(tmp1, receiver); // 2 instructions with UseCompactObjectHeaders
ld(tmp2, in_bytes(CompiledICData::speculated_klass_offset()), data);
trap_ic_miss_check(tmp1, tmp2);
@@ -1306,11 +1300,7 @@ int MacroAssembler::ic_check(int end_alignment) {
cmpdi(CR0, receiver, 0);
beqctr(CR0);
}
- if (UseCompressedClassPointers) {
- lwz(tmp1, oopDesc::klass_offset_in_bytes(), receiver);
- } else {
- ld(tmp1, oopDesc::klass_offset_in_bytes(), receiver);
- }
+ load_klass_no_decode(tmp1, receiver); // 2 instructions with UseCompactObjectHeaders
ld(tmp2, in_bytes(CompiledICData::speculated_klass_offset()), data);
cmpd(CR0, tmp1, tmp2);
bnectr(CR0);
@@ -1347,7 +1337,7 @@ void MacroAssembler::call_VM_base(Register oop_result,
// Get oop result if there is one and reset the value in the thread.
if (oop_result->is_valid()) {
- get_vm_result(oop_result);
+ get_vm_result_oop(oop_result);
}
_last_calls_return_pc = return_pc;
@@ -3010,7 +3000,7 @@ void MacroAssembler::compiler_fast_lock_lightweight_object(ConditionRegister fla
Label slow_path;
if (UseObjectMonitorTable) {
- // Clear cache in case fast locking succeeds.
+ // Clear cache in case fast locking succeeds or we need to take the slow-path.
li(tmp1, 0);
std(tmp1, in_bytes(BasicObjectLock::lock_offset()) + BasicLock::object_monitor_cache_offset_in_bytes(), box);
}
@@ -3435,34 +3425,34 @@ void MacroAssembler::set_top_ijava_frame_at_SP_as_last_Java_frame(Register sp, R
set_last_Java_frame(/*sp=*/sp, /*pc=*/tmp1);
}
-void MacroAssembler::get_vm_result(Register oop_result) {
+void MacroAssembler::get_vm_result_oop(Register oop_result) {
// Read:
// R16_thread
- // R16_thread->in_bytes(JavaThread::vm_result_offset())
+ // R16_thread->in_bytes(JavaThread::vm_result_oop_offset())
//
// Updated:
// oop_result
- // R16_thread->in_bytes(JavaThread::vm_result_offset())
+ // R16_thread->in_bytes(JavaThread::vm_result_oop_offset())
- ld(oop_result, in_bytes(JavaThread::vm_result_offset()), R16_thread);
+ ld(oop_result, in_bytes(JavaThread::vm_result_oop_offset()), R16_thread);
li(R0, 0);
- std(R0, in_bytes(JavaThread::vm_result_offset()), R16_thread);
+ std(R0, in_bytes(JavaThread::vm_result_oop_offset()), R16_thread);
verify_oop(oop_result, FILE_AND_LINE);
}
-void MacroAssembler::get_vm_result_2(Register metadata_result) {
+void MacroAssembler::get_vm_result_metadata(Register metadata_result) {
// Read:
// R16_thread
- // R16_thread->in_bytes(JavaThread::vm_result_2_offset())
+ // R16_thread->in_bytes(JavaThread::vm_result_metadata_offset())
//
// Updated:
// metadata_result
- // R16_thread->in_bytes(JavaThread::vm_result_2_offset())
+ // R16_thread->in_bytes(JavaThread::vm_result_metadata_offset())
- ld(metadata_result, in_bytes(JavaThread::vm_result_2_offset()), R16_thread);
+ ld(metadata_result, in_bytes(JavaThread::vm_result_metadata_offset()), R16_thread);
li(R0, 0);
- std(R0, in_bytes(JavaThread::vm_result_2_offset()), R16_thread);
+ std(R0, in_bytes(JavaThread::vm_result_metadata_offset()), R16_thread);
}
Register MacroAssembler::encode_klass_not_null(Register dst, Register src) {
@@ -3536,18 +3526,23 @@ void MacroAssembler::decode_klass_not_null(Register dst, Register src) {
}
}
-void MacroAssembler::load_klass(Register dst, Register src) {
+void MacroAssembler::load_klass_no_decode(Register dst, Register src) {
if (UseCompactObjectHeaders) {
load_narrow_klass_compact(dst, src);
- decode_klass_not_null(dst);
} else if (UseCompressedClassPointers) {
lwz(dst, oopDesc::klass_offset_in_bytes(), src);
- decode_klass_not_null(dst);
} else {
ld(dst, oopDesc::klass_offset_in_bytes(), src);
}
}
+void MacroAssembler::load_klass(Register dst, Register src) {
+ load_klass_no_decode(dst, src);
+ if (UseCompressedClassPointers) { // also true for UseCompactObjectHeaders
+ decode_klass_not_null(dst);
+ }
+}
+
// Loads the obj's Klass* into dst.
// Preserves all registers (incl src, rscratch1 and rscratch2).
// Input:
@@ -5004,19 +4999,27 @@ void MacroAssembler::atomically_flip_locked_state(bool is_unlock, Register obj,
// - t1, t2: temporary register
void MacroAssembler::lightweight_lock(Register box, Register obj, Register t1, Register t2, Label& slow) {
assert(LockingMode == LM_LIGHTWEIGHT, "only used with new lightweight locking");
- assert_different_registers(box, obj, t1, t2);
+ assert_different_registers(box, obj, t1, t2, R0);
Label push;
- const Register top = t1;
- const Register mark = t2;
const Register t = R0;
if (UseObjectMonitorTable) {
- // Clear cache in case fast locking succeeds.
+ // Clear cache in case fast locking succeeds or we need to take the slow-path.
li(t, 0);
std(t, in_bytes(BasicObjectLock::lock_offset()) + BasicLock::object_monitor_cache_offset_in_bytes(), box);
}
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(t1, obj);
+ lbz(t1, in_bytes(Klass::misc_flags_offset()), t1);
+ testbitdi(CR0, R0, t1, exact_log2(KlassFlags::_misc_is_value_based_class));
+ bne(CR0, slow);
+ }
+
+ const Register top = t1;
+ const Register mark = t2;
+
// Check if the lock-stack is full.
lwz(top, in_bytes(JavaThread::lock_stack_top_offset()), R16_thread);
cmplwi(CR0, top, LockStack::end_offset());
diff --git a/src/hotspot/cpu/ppc/macroAssembler_ppc.hpp b/src/hotspot/cpu/ppc/macroAssembler_ppc.hpp
index 69570517866d6..7e2925ace26c8 100644
--- a/src/hotspot/cpu/ppc/macroAssembler_ppc.hpp
+++ b/src/hotspot/cpu/ppc/macroAssembler_ppc.hpp
@@ -745,8 +745,8 @@ class MacroAssembler: public Assembler {
void set_top_ijava_frame_at_SP_as_last_Java_frame(Register sp, Register tmp1, Label* jpc = nullptr);
// Read vm result from thread: oop_result = R16_thread->result;
- void get_vm_result (Register oop_result);
- void get_vm_result_2(Register metadata_result);
+ void get_vm_result_oop(Register oop_result);
+ void get_vm_result_metadata(Register metadata_result);
static bool needs_explicit_null_check(intptr_t offset);
static bool uses_implicit_null_check(void* address);
@@ -802,6 +802,7 @@ class MacroAssembler: public Assembler {
inline void decode_heap_oop(Register d);
// Load/Store klass oop from klass field. Compress.
+ void load_klass_no_decode(Register dst, Register src);
void load_klass(Register dst, Register src);
void load_narrow_klass_compact(Register dst, Register src);
void cmp_klass(ConditionRegister dst, Register obj, Register klass, Register tmp, Register tmp2);
diff --git a/src/hotspot/cpu/ppc/ppc.ad b/src/hotspot/cpu/ppc/ppc.ad
index 022a70d52a2b0..07d681e89823e 100644
--- a/src/hotspot/cpu/ppc/ppc.ad
+++ b/src/hotspot/cpu/ppc/ppc.ad
@@ -14699,7 +14699,8 @@ instruct ShouldNotReachHere() %{
format %{ "ShouldNotReachHere" %}
ins_encode %{
if (is_reachable()) {
- __ stop(_halt_reason);
+ const char* str = __ code_string(_halt_reason);
+ __ stop(str);
}
%}
ins_pipe(pipe_class_default);
diff --git a/src/hotspot/cpu/ppc/runtime_ppc.cpp b/src/hotspot/cpu/ppc/runtime_ppc.cpp
index 94e8c08ebf5fc..6d9a1dfcb1ea8 100644
--- a/src/hotspot/cpu/ppc/runtime_ppc.cpp
+++ b/src/hotspot/cpu/ppc/runtime_ppc.cpp
@@ -73,6 +73,9 @@ ExceptionBlob* OptoRuntime::generate_exception_blob() {
// Setup code generation tools.
const char* name = OptoRuntime::stub_name(OptoStubId::exception_id);
CodeBuffer buffer(name, 2048, 1024);
+ if (buffer.blob() == nullptr) {
+ return nullptr;
+ }
InterpreterMacroAssembler* masm = new InterpreterMacroAssembler(&buffer);
address start = __ pc();
diff --git a/src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp b/src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp
index 0d0e004c92383..5a33a14f79e0b 100644
--- a/src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp
+++ b/src/hotspot/cpu/ppc/sharedRuntime_ppc.cpp
@@ -1143,12 +1143,12 @@ void SharedRuntime::gen_i2c_adapter(MacroAssembler *masm,
__ bctr();
}
-AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm,
- int total_args_passed,
- int comp_args_on_stack,
- const BasicType *sig_bt,
- const VMRegPair *regs,
- AdapterFingerPrint* fingerprint) {
+void SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm,
+ int total_args_passed,
+ int comp_args_on_stack,
+ const BasicType *sig_bt,
+ const VMRegPair *regs,
+ AdapterHandlerEntry* handler) {
address i2c_entry;
address c2i_unverified_entry;
address c2i_entry;
@@ -1223,8 +1223,8 @@ AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm
gen_c2i_adapter(masm, total_args_passed, comp_args_on_stack, sig_bt, regs, call_interpreter, ientry);
- return AdapterHandlerLibrary::new_entry(fingerprint, i2c_entry, c2i_entry, c2i_unverified_entry,
- c2i_no_clinit_check_entry);
+ handler->set_entry_points(i2c_entry, c2i_entry, c2i_unverified_entry, c2i_no_clinit_check_entry);
+ return;
}
// An oop arg. Must pass a handle not the oop itself.
@@ -3106,6 +3106,9 @@ UncommonTrapBlob* OptoRuntime::generate_uncommon_trap_blob() {
// Setup code generation tools.
const char* name = OptoRuntime::stub_name(OptoStubId::uncommon_trap_id);
CodeBuffer buffer(name, 2048, 1024);
+ if (buffer.blob() == nullptr) {
+ return nullptr;
+ }
InterpreterMacroAssembler* masm = new InterpreterMacroAssembler(&buffer);
address start = __ pc();
@@ -3404,7 +3407,7 @@ RuntimeStub* SharedRuntime::generate_resolve_blob(SharedStubId id, address desti
RegisterSaver::restore_live_registers_and_pop_frame(masm, frame_size_in_bytes, /*restore_ctr*/ false);
// Get the returned method.
- __ get_vm_result_2(R19_method);
+ __ get_vm_result_metadata(R19_method);
__ bctr();
@@ -3418,7 +3421,7 @@ RuntimeStub* SharedRuntime::generate_resolve_blob(SharedStubId id, address desti
__ li(R11_scratch1, 0);
__ ld(R3_ARG1, thread_(pending_exception));
- __ std(R11_scratch1, in_bytes(JavaThread::vm_result_offset()), R16_thread);
+ __ std(R11_scratch1, in_bytes(JavaThread::vm_result_oop_offset()), R16_thread);
__ b64_patchable(StubRoutines::forward_exception_entry(), relocInfo::runtime_call_type);
// -------------
diff --git a/src/hotspot/cpu/ppc/stubGenerator_ppc.cpp b/src/hotspot/cpu/ppc/stubGenerator_ppc.cpp
index fa356ec13ac16..4a0ced42ed4e8 100644
--- a/src/hotspot/cpu/ppc/stubGenerator_ppc.cpp
+++ b/src/hotspot/cpu/ppc/stubGenerator_ppc.cpp
@@ -546,6 +546,177 @@ class StubGenerator: public StubCodeGenerator {
return start;
}
+ // Computes the Galois/Counter Mode (GCM) product and reduction.
+ //
+ // This function performs polynomial multiplication of the subkey H with
+ // the current GHASH state using vectorized polynomial multiplication (`vpmsumd`).
+ // The subkey H is divided into lower, middle, and higher halves.
+ // The multiplication results are reduced using `vConstC2` to stay within GF(2^128).
+ // The final computed value is stored back into `vState`.
+ static void computeGCMProduct(MacroAssembler* _masm,
+ VectorRegister vLowerH, VectorRegister vH, VectorRegister vHigherH,
+ VectorRegister vConstC2, VectorRegister vZero, VectorRegister vState,
+ VectorRegister vLowProduct, VectorRegister vMidProduct, VectorRegister vHighProduct,
+ VectorRegister vReducedLow, VectorRegister vTmp8, VectorRegister vTmp9,
+ VectorRegister vCombinedResult, VectorRegister vSwappedH) {
+ __ vxor(vH, vH, vState);
+ __ vpmsumd(vLowProduct, vLowerH, vH); // L : Lower Half of subkey H
+ __ vpmsumd(vMidProduct, vSwappedH, vH); // M : Combined halves of subkey H
+ __ vpmsumd(vHighProduct, vHigherH, vH); // H : Higher Half of subkey H
+ __ vpmsumd(vReducedLow, vLowProduct, vConstC2); // Reduction
+ __ vsldoi(vTmp8, vMidProduct, vZero, 8); // mL : Extract the lower 64 bits of M
+ __ vsldoi(vTmp9, vZero, vMidProduct, 8); // mH : Extract the higher 64 bits of M
+ __ vxor(vLowProduct, vLowProduct, vTmp8); // LL + mL : Partial result for lower half
+ __ vxor(vHighProduct, vHighProduct, vTmp9); // HH + mH : Partial result for upper half
+ __ vsldoi(vLowProduct, vLowProduct, vLowProduct, 8); // Swap
+ __ vxor(vLowProduct, vLowProduct, vReducedLow);
+ __ vsldoi(vCombinedResult, vLowProduct, vLowProduct, 8); // Swap
+ __ vpmsumd(vLowProduct, vLowProduct, vConstC2); // Reduction using constant
+ __ vxor(vCombinedResult, vCombinedResult, vHighProduct); // Combine reduced Low & High products
+ __ vxor(vState, vLowProduct, vCombinedResult);
+ }
+
+ // Generate stub for ghash process blocks.
+ //
+ // Arguments for generated stub:
+ // state: R3_ARG1 (long[] state)
+ // subkeyH: R4_ARG2 (long[] subH)
+ // data: R5_ARG3 (byte[] data)
+ // blocks: R6_ARG4 (number of 16-byte blocks to process)
+ //
+ // The polynomials are processed in bit-reflected order for efficiency reasons.
+ // This optimization leverages the structure of the Galois field arithmetic
+ // to minimize the number of bit manipulations required during multiplication.
+  // For an explanation of how this works, refer to:
+ // Vinodh Gopal, Erdinc Ozturk, Wajdi Feghali, Jim Guilford, Gil Wolrich,
+ // Martin Dixon. "Optimized Galois-Counter-Mode Implementation on Intel®
+ // Architecture Processor"
+ // http://web.archive.org/web/20130609111954/http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/communications-ia-galois-counter-mode-paper.pdf
+ //
+ //
+ address generate_ghash_processBlocks() {
+ StubCodeMark mark(this, "StubRoutines", "ghash");
+ address start = __ function_entry();
+
+ // Registers for parameters
+ Register state = R3_ARG1; // long[] state
+ Register subkeyH = R4_ARG2; // long[] subH
+ Register data = R5_ARG3; // byte[] data
+ Register blocks = R6_ARG4;
+ Register temp1 = R8;
+ // Vector Registers
+ VectorRegister vZero = VR0;
+ VectorRegister vH = VR1;
+ VectorRegister vLowerH = VR2;
+ VectorRegister vHigherH = VR3;
+ VectorRegister vLowProduct = VR4;
+ VectorRegister vMidProduct = VR5;
+ VectorRegister vHighProduct = VR6;
+ VectorRegister vReducedLow = VR7;
+ VectorRegister vTmp8 = VR8;
+ VectorRegister vTmp9 = VR9;
+ VectorRegister vTmp10 = VR10;
+ VectorRegister vSwappedH = VR11;
+ VectorRegister vTmp12 = VR12;
+ VectorRegister loadOrder = VR13;
+ VectorRegister vHigh = VR14;
+ VectorRegister vLow = VR15;
+ VectorRegister vState = VR16;
+ VectorRegister vPerm = VR17;
+ VectorRegister vCombinedResult = VR18;
+ VectorRegister vConstC2 = VR19;
+
+ __ li(temp1, 0xc2);
+ __ sldi(temp1, temp1, 56);
+ __ vspltisb(vZero, 0);
+ __ mtvrd(vConstC2, temp1);
+ __ lxvd2x(vH->to_vsr(), subkeyH);
+ __ lxvd2x(vState->to_vsr(), state);
+    // Operations to obtain the lower and higher halves of subkey H.
+ __ vspltisb(vReducedLow, 1);
+ __ vspltisb(vTmp10, 7);
+ __ vsldoi(vTmp8, vZero, vReducedLow, 1); // 0x1
+ __ vor(vTmp8, vConstC2, vTmp8); // 0xC2...1
+ __ vsplt(vTmp9, 0, vH); // MSB of H
+ __ vsl(vH, vH, vReducedLow); // Carry = H<<7
+ __ vsrab(vTmp9, vTmp9, vTmp10);
+ __ vand(vTmp9, vTmp9, vTmp8); // Carry
+ __ vxor(vTmp10, vH, vTmp9);
+ __ vsldoi(vConstC2, vZero, vConstC2, 8);
+ __ vsldoi(vSwappedH, vTmp10, vTmp10, 8); // swap Lower and Higher Halves of subkey H
+ __ vsldoi(vLowerH, vZero, vSwappedH, 8); // H.L
+ __ vsldoi(vHigherH, vSwappedH, vZero, 8); // H.H
+#ifdef ASSERT
+ __ cmpwi(CR0, blocks, 0); // Compare 'blocks' (R6_ARG4) with zero
+ __ asm_assert_ne("blocks should NOT be zero");
+#endif
+ __ clrldi(blocks, blocks, 32);
+ __ mtctr(blocks);
+ __ lvsl(loadOrder, temp1);
+#ifdef VM_LITTLE_ENDIAN
+ __ vspltisb(vTmp12, 0xf);
+ __ vxor(loadOrder, loadOrder, vTmp12);
+#define LE_swap_bytes(x) __ vec_perm(x, x, x, loadOrder)
+#else
+#define LE_swap_bytes(x)
+#endif
+
+ // This code performs Karatsuba multiplication in Galois fields to compute the GHASH operation.
+ //
+ // The Karatsuba method breaks the multiplication of two 128-bit numbers into smaller parts,
+ // performing three 128-bit multiplications and combining the results efficiently.
+ //
+ // (C1:C0) = A1*B1, (D1:D0) = A0*B0, (E1:E0) = (A0+A1)(B0+B1)
+ // (A1:A0)(B1:B0) = C1:(C0+C1+D1+E1):(D1+C0+D0+E0):D0
+ //
+ // Inputs:
+ // - vH: The data vector (state), containing both B0 (lower half) and B1 (higher half).
+ // - vLowerH: Lower half of the subkey H (A0).
+ // - vHigherH: Higher half of the subkey H (A1).
+ // - vConstC2: Constant used for reduction (for final processing).
+ //
+ // References:
+ // Shay Gueron, Michael E. Kounavis.
+ // "Intel® Carry-Less Multiplication Instruction and its Usage for Computing the GCM Mode"
+ // https://web.archive.org/web/20110609115824/https://software.intel.com/file/24918
+ //
+ Label L_aligned_loop, L_store, L_unaligned_loop, L_initialize_unaligned_loop;
+ __ andi(temp1, data, 15);
+ __ cmpwi(CR0, temp1, 0);
+ __ bne(CR0, L_initialize_unaligned_loop);
+
+ __ bind(L_aligned_loop);
+ __ lvx(vH, temp1, data);
+ LE_swap_bytes(vH);
+ computeGCMProduct(_masm, vLowerH, vH, vHigherH, vConstC2, vZero, vState,
+ vLowProduct, vMidProduct, vHighProduct, vReducedLow, vTmp8, vTmp9, vCombinedResult, vSwappedH);
+ __ addi(data, data, 16);
+ __ bdnz(L_aligned_loop);
+ __ b(L_store);
+
+ __ bind(L_initialize_unaligned_loop);
+ __ li(temp1, 0);
+ __ lvsl(vPerm, temp1, data);
+ __ lvx(vHigh, temp1, data);
+#ifdef VM_LITTLE_ENDIAN
+ __ vspltisb(vTmp12, -1);
+ __ vxor(vPerm, vPerm, vTmp12);
+#endif
+ __ bind(L_unaligned_loop);
+ __ addi(data, data, 16);
+ __ lvx(vLow, temp1, data);
+ __ vec_perm(vH, vHigh, vLow, vPerm);
+ computeGCMProduct(_masm, vLowerH, vH, vHigherH, vConstC2, vZero, vState,
+ vLowProduct, vMidProduct, vHighProduct, vReducedLow, vTmp8, vTmp9, vCombinedResult, vSwappedH);
+ __ vmr(vHigh, vLow);
+ __ bdnz(L_unaligned_loop);
+
+ __ bind(L_store);
+ __ stxvd2x(vState->to_vsr(), state);
+ __ blr();
+
+ return start;
+ }
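The Karatsuba identity quoted in the comments above can be checked against a small
software model. The following C++ sketch is illustrative only (it is not part of the
patch and emulates vpmsumd with a bit loop); it combines the three partial products
exactly as described, i.e. C1:(C0+C1+D1+E1):(D1+C0+D0+E0):D0.

    #include <stdint.h>

    // Carry-less 64x64 -> 128-bit multiply; software stand-in for one vpmsumd.
    static void clmul64(uint64_t a, uint64_t b, uint64_t &hi, uint64_t &lo) {
      hi = 0; lo = 0;
      for (int i = 0; i < 64; i++) {
        if ((b >> i) & 1) {
          lo ^= a << i;
          if (i != 0) hi ^= a >> (64 - i);
        }
      }
    }

    // 128x128 carry-less multiply via Karatsuba: three multiplies instead of four.
    // Operands are a1:a0 and b1:b0; the 256-bit product is returned as r[3]..r[0].
    static void clmul128(uint64_t a1, uint64_t a0, uint64_t b1, uint64_t b0, uint64_t r[4]) {
      uint64_t c1, c0, d1, d0, e1, e0;
      clmul64(a1, b1, c1, c0);            // C = A1*B1
      clmul64(a0, b0, d1, d0);            // D = A0*B0
      clmul64(a0 ^ a1, b0 ^ b1, e1, e0);  // E = (A0+A1)*(B0+B1), '+' is XOR in GF(2)
      r[3] = c1;
      r[2] = c0 ^ c1 ^ d1 ^ e1;
      r[1] = d1 ^ c0 ^ d0 ^ e0;
      r[0] = d0;
    }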
// -XX:+OptimizeFill : convert fill/copy loops into intrinsic
//
// The code is implemented(ported from sparc) as we believe it benefits JVM98, however
@@ -2383,6 +2554,105 @@ class StubGenerator: public StubCodeGenerator {
}
+ // Helper for generate_unsafe_setmemory
+ //
+ // Atomically fill an array of memory using 1-, 2-, 4-, or 8-byte chunks and return.
+ static void do_setmemory_atomic_loop(int elem_size, Register dest, Register size, Register byteVal,
+ MacroAssembler *_masm) {
+
+ Label L_Loop, L_Tail; // 2x unrolled loop
+
+ // Propagate byte to required width
+ if (elem_size > 1) __ rldimi(byteVal, byteVal, 8, 64 - 2 * 8);
+ if (elem_size > 2) __ rldimi(byteVal, byteVal, 16, 64 - 2 * 16);
+ if (elem_size > 4) __ rldimi(byteVal, byteVal, 32, 64 - 2 * 32);
+
+ __ srwi_(R0, size, exact_log2(2 * elem_size)); // size is a 32 bit value
+ __ beq(CR0, L_Tail);
+ __ mtctr(R0);
+
+ __ align(32); // loop alignment
+ __ bind(L_Loop);
+ __ store_sized_value(byteVal, 0, dest, elem_size);
+ __ store_sized_value(byteVal, elem_size, dest, elem_size);
+ __ addi(dest, dest, 2 * elem_size);
+ __ bdnz(L_Loop);
+
+ __ bind(L_Tail);
+ __ andi_(R0, size, elem_size);
+ __ bclr(Assembler::bcondCRbiIs1, Assembler::bi0(CR0, Assembler::equal), Assembler::bhintbhBCLRisReturn);
+ __ store_sized_value(byteVal, 0, dest, elem_size);
+ __ blr();
+ }
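The rldimi sequence above replicates the fill byte across the element width so that a
single store writes the value into every byte of the chunk. A plain C++ model of that
propagation (an illustrative sketch, not part of the patch):

    #include <stdint.h>

    // Replicate the low byte of v across 2, 4 or 8 bytes.
    static uint64_t propagate_byte(uint64_t v, int elem_size) {
      v &= 0xff;
      if (elem_size > 1) v |= v << 8;    // 0x00ab     -> 0xabab
      if (elem_size > 2) v |= v << 16;   // 0xabab     -> 0xabababab
      if (elem_size > 4) v |= v << 32;   // 0xabababab -> 0xabababababababab
      return v;
    }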
+
+ //
+ // Generate 'unsafe' set memory stub
+ // Though just as safe as the other stubs, it takes an unscaled
+ // size_t (# bytes) argument instead of an element count.
+ //
+ // Input:
+ // R3_ARG1 - destination array address
+ // R4_ARG2 - byte count (size_t)
+ // R5_ARG3 - byte value
+ //
+ address generate_unsafe_setmemory(address unsafe_byte_fill) {
+ __ align(CodeEntryAlignment);
+ StubCodeMark mark(this, StubGenStubId::unsafe_setmemory_id);
+ address start = __ function_entry();
+
+ // bump this on entry, not on exit:
+ // inc_counter_np(SharedRuntime::_unsafe_set_memory_ctr);
+
+ {
+ Label L_fill8Bytes, L_fill4Bytes, L_fillBytes;
+
+ const Register dest = R3_ARG1;
+ const Register size = R4_ARG2;
+ const Register byteVal = R5_ARG3;
+ const Register rScratch1 = R6;
+
+ // fill_to_memory_atomic(unsigned char*, unsigned long, unsigned char)
+
+ // Check for pointer & size alignment
+ __ orr(rScratch1, dest, size);
+
+ __ andi_(R0, rScratch1, 7);
+ __ beq(CR0, L_fill8Bytes);
+
+ __ andi_(R0, rScratch1, 3);
+ __ beq(CR0, L_fill4Bytes);
+
+ __ andi_(R0, rScratch1, 1);
+ __ bne(CR0, L_fillBytes);
+
+ // Mark remaining code as such which performs Unsafe accesses.
+ UnsafeMemoryAccessMark umam(this, true, false);
+
+      // At this point, we know the lowest bit of size is zero, so size is a
+      // multiple of 2.
+ do_setmemory_atomic_loop(2, dest, size, byteVal, _masm);
+
+ __ align(32);
+ __ bind(L_fill8Bytes);
+      // At this point, we know the lower 3 bits of size are zero, so size is
+      // a multiple of 8.
+ do_setmemory_atomic_loop(8, dest, size, byteVal, _masm);
+
+ __ align(32);
+ __ bind(L_fill4Bytes);
+      // At this point, we know the lower 2 bits of size are zero, so size is
+      // a multiple of 4.
+ do_setmemory_atomic_loop(4, dest, size, byteVal, _masm);
+
+ __ align(32);
+ __ bind(L_fillBytes);
+ do_setmemory_atomic_loop(1, dest, size, byteVal, _masm);
+ }
+
+ return start;
+ }
+
+
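The branch ladder above selects the widest element size for which both the destination
pointer and the byte count are suitably aligned, so every store stays atomic for that
width. A rough C++ model of the dispatch (an assumed reading of the stub, not generated
code):

    #include <stddef.h>
    #include <stdint.h>

    // Widest chunk size usable for an atomic fill of 'size' bytes at 'dest'.
    static int chunk_size(uintptr_t dest, size_t size) {
      uintptr_t bits = dest | size;
      if ((bits & 7) == 0) return 8;
      if ((bits & 3) == 0) return 4;
      if ((bits & 1) == 0) return 2;
      return 1;
    }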
//
// Generate generic array copy stubs
//
@@ -3207,6 +3477,7 @@ class StubGenerator: public StubCodeGenerator {
StubRoutines::_arrayof_jshort_fill = generate_fill(StubGenStubId::arrayof_jshort_fill_id);
StubRoutines::_arrayof_jint_fill = generate_fill(StubGenStubId::arrayof_jint_fill_id);
}
+ StubRoutines::_unsafe_setmemory = generate_unsafe_setmemory(StubRoutines::_jbyte_fill);
#endif
}
@@ -4928,6 +5199,10 @@ void generate_lookup_secondary_supers_table_stub() {
StubRoutines::_data_cache_writeback_sync = generate_data_cache_writeback_sync();
}
+ if (UseGHASHIntrinsics) {
+ StubRoutines::_ghash_processBlocks = generate_ghash_processBlocks();
+ }
+
if (UseAESIntrinsics) {
StubRoutines::_aescrypt_encryptBlock = generate_aescrypt_encryptBlock();
StubRoutines::_aescrypt_decryptBlock = generate_aescrypt_decryptBlock();
diff --git a/src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp b/src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp
index b1ad3cd48bcc6..a8f5dbda484d6 100644
--- a/src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp
+++ b/src/hotspot/cpu/ppc/templateInterpreterGenerator_ppc.cpp
@@ -2160,12 +2160,12 @@ void TemplateInterpreterGenerator::generate_throw_exception() {
{
__ pop_ptr(Rexception);
__ verify_oop(Rexception);
- __ std(Rexception, in_bytes(JavaThread::vm_result_offset()), R16_thread);
+ __ std(Rexception, in_bytes(JavaThread::vm_result_oop_offset()), R16_thread);
__ unlock_if_synchronized_method(vtos, /* throw_monitor_exception */ false, true);
__ notify_method_exit(false, vtos, InterpreterMacroAssembler::SkipNotifyJVMTI, false);
- __ get_vm_result(Rexception);
+ __ get_vm_result_oop(Rexception);
// We are done with this activation frame; find out where to go next.
// The continuation point will be an exception handler, which expects
diff --git a/src/hotspot/cpu/ppc/templateTable_ppc_64.cpp b/src/hotspot/cpu/ppc/templateTable_ppc_64.cpp
index 934bb1bd52918..8be6080e3d1c0 100644
--- a/src/hotspot/cpu/ppc/templateTable_ppc_64.cpp
+++ b/src/hotspot/cpu/ppc/templateTable_ppc_64.cpp
@@ -386,7 +386,7 @@ void TemplateTable::condy_helper(Label& Done) {
const Register rarg = R4_ARG2;
__ li(rarg, (int)bytecode());
call_VM(obj, CAST_FROM_FN_PTR(address, InterpreterRuntime::resolve_ldc), rarg);
- __ get_vm_result_2(flags);
+ __ get_vm_result_metadata(flags);
// VMr = obj = base address to find primitive value to push
// VMr2 = flags = (tos, off) using format of CPCE::_flags
@@ -3964,7 +3964,7 @@ void TemplateTable::checkcast() {
// Call into the VM to "quicken" instanceof.
__ push_ptr(); // for GC
call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
- __ get_vm_result_2(RspecifiedKlass);
+ __ get_vm_result_metadata(RspecifiedKlass);
__ pop_ptr(); // Restore receiver.
__ b(Lresolved);
@@ -4026,7 +4026,7 @@ void TemplateTable::instanceof() {
// Call into the VM to "quicken" instanceof.
__ push_ptr(); // for GC
call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
- __ get_vm_result_2(RspecifiedKlass);
+ __ get_vm_result_metadata(RspecifiedKlass);
__ pop_ptr(); // Restore receiver.
__ b(Lresolved);
diff --git a/src/hotspot/cpu/ppc/vm_version_ppc.cpp b/src/hotspot/cpu/ppc/vm_version_ppc.cpp
index 8ec69bffe15ea..c8c53543d14ac 100644
--- a/src/hotspot/cpu/ppc/vm_version_ppc.cpp
+++ b/src/hotspot/cpu/ppc/vm_version_ppc.cpp
@@ -219,7 +219,7 @@ void VM_Version::initialize() {
(has_brw() ? " brw" : "")
// Make sure number of %s matches num_features!
);
- _features_string = os::strdup(buf);
+ _cpu_info_string = os::strdup(buf);
if (Verbose) {
print_features();
}
@@ -308,8 +308,14 @@ void VM_Version::initialize() {
FLAG_SET_DEFAULT(UseAESCTRIntrinsics, false);
}
- if (UseGHASHIntrinsics) {
- warning("GHASH intrinsics are not available on this CPU");
+ if (VM_Version::has_vsx()) {
+ if (FLAG_IS_DEFAULT(UseGHASHIntrinsics)) {
+ UseGHASHIntrinsics = true;
+ }
+ } else if (UseGHASHIntrinsics) {
+ if (!FLAG_IS_DEFAULT(UseGHASHIntrinsics)) {
+ warning("GHASH intrinsics are not available on this CPU");
+ }
FLAG_SET_DEFAULT(UseGHASHIntrinsics, false);
}
@@ -519,7 +525,7 @@ void VM_Version::print_platform_virtualization_info(outputStream* st) {
}
void VM_Version::print_features() {
- tty->print_cr("Version: %s L1_data_cache_line_size=%d", features_string(), L1_data_cache_line_size());
+ tty->print_cr("Version: %s L1_data_cache_line_size=%d", cpu_info_string(), L1_data_cache_line_size());
if (Verbose) {
if (ContendedPaddingWidth > 0) {
@@ -726,6 +732,6 @@ void VM_Version::initialize_cpu_information(void) {
_no_of_threads = _no_of_cores;
_no_of_sockets = _no_of_cores;
snprintf(_cpu_name, CPU_TYPE_DESC_BUF_SIZE, "PowerPC POWER%lu", PowerArchitecturePPC64);
- snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "PPC %s", features_string());
+ snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "PPC %s", cpu_info_string());
_initialized = true;
}
diff --git a/src/hotspot/cpu/riscv/assembler_riscv.hpp b/src/hotspot/cpu/riscv/assembler_riscv.hpp
index 3c19673f1e749..63d5d4f3e613d 100644
--- a/src/hotspot/cpu/riscv/assembler_riscv.hpp
+++ b/src/hotspot/cpu/riscv/assembler_riscv.hpp
@@ -815,7 +815,7 @@ class Assembler : public AbstractAssembler {
emit(insn);
}
- public:
+ protected:
enum barrier {
i = 0b1000, o = 0b0100, r = 0b0010, w = 0b0001,
@@ -846,6 +846,8 @@ class Assembler : public AbstractAssembler {
emit(insn);
}
+ public:
+
#define INSN(NAME, op, funct3, funct7) \
void NAME() { \
unsigned insn = 0; \
@@ -1902,8 +1904,14 @@ enum VectorMask {
INSN(vand_vv, 0b1010111, 0b000, 0b001001);
// Vector Single-Width Integer Add and Subtract
- INSN(vsub_vv, 0b1010111, 0b000, 0b000010);
INSN(vadd_vv, 0b1010111, 0b000, 0b000000);
+ INSN(vsub_vv, 0b1010111, 0b000, 0b000010);
+
+ // Vector Saturating Integer Add and Subtract
+ INSN(vsadd_vv, 0b1010111, 0b000, 0b100001);
+ INSN(vsaddu_vv, 0b1010111, 0b000, 0b100000);
+ INSN(vssub_vv, 0b1010111, 0b000, 0b100011);
+ INSN(vssubu_vv, 0b1010111, 0b000, 0b100010);
// Vector Register Gather Instructions
INSN(vrgather_vv, 0b1010111, 0b000, 0b001100);
@@ -2321,6 +2329,7 @@ enum Nf {
}
// Vector Bit-manipulation used in Cryptography (Zvbb) Extension
+ INSN(vandn_vx, 0b1010111, 0b100, 0b000001);
INSN(vrol_vx, 0b1010111, 0b100, 0b010101);
INSN(vror_vx, 0b1010111, 0b100, 0b010100);
diff --git a/src/hotspot/cpu/riscv/c1_CodeStubs_riscv.cpp b/src/hotspot/cpu/riscv/c1_CodeStubs_riscv.cpp
index b9bd7b356fa6b..ea299181ca7af 100644
--- a/src/hotspot/cpu/riscv/c1_CodeStubs_riscv.cpp
+++ b/src/hotspot/cpu/riscv/c1_CodeStubs_riscv.cpp
@@ -70,7 +70,7 @@ void RangeCheckStub::emit_code(LIR_Assembler* ce) {
__ far_call(RuntimeAddress(a));
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
return;
}
@@ -87,12 +87,12 @@ void RangeCheckStub::emit_code(LIR_Assembler* ce) {
__ mv(t1, _array->as_pointer_register());
stub_id = C1StubId::throw_range_check_failed_id;
}
- // t0 and t1 are used as args in generate_exception_throw,
+ // t0 and t1 are used as args in generate_exception_throw,
// so use x1/ra as the tmp register for rt_call.
__ rt_call(Runtime1::entry_for(stub_id), ra);
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
PredicateFailedStub::PredicateFailedStub(CodeEmitInfo* info) {
@@ -105,7 +105,7 @@ void PredicateFailedStub::emit_code(LIR_Assembler* ce) {
__ far_call(RuntimeAddress(a));
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
void DivByZeroStub::emit_code(LIR_Assembler* ce) {
@@ -258,7 +258,7 @@ void ImplicitNullCheckStub::emit_code(LIR_Assembler* ce) {
__ far_call(RuntimeAddress(a));
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
void SimpleExceptionStub::emit_code(LIR_Assembler* ce) {
@@ -272,7 +272,7 @@ void SimpleExceptionStub::emit_code(LIR_Assembler* ce) {
}
__ far_call(RuntimeAddress(Runtime1::entry_for(_stub)));
ce->add_call_info_here(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
void ArrayCopyStub::emit_code(LIR_Assembler* ce) {
diff --git a/src/hotspot/cpu/riscv/c1_MacroAssembler_riscv.cpp b/src/hotspot/cpu/riscv/c1_MacroAssembler_riscv.cpp
index 76089e8dd4536..29bf3e5f2ed65 100644
--- a/src/hotspot/cpu/riscv/c1_MacroAssembler_riscv.cpp
+++ b/src/hotspot/cpu/riscv/c1_MacroAssembler_riscv.cpp
@@ -61,16 +61,17 @@ int C1_MacroAssembler::lock_object(Register hdr, Register obj, Register disp_hdr
null_check_offset = offset();
- if (DiagnoseSyncOnValueBasedClasses != 0) {
- load_klass(hdr, obj);
- lbu(hdr, Address(hdr, Klass::misc_flags_offset()));
- test_bit(temp, hdr, exact_log2(KlassFlags::_misc_is_value_based_class));
- bnez(temp, slow_case, true /* is_far */);
- }
-
if (LockingMode == LM_LIGHTWEIGHT) {
lightweight_lock(disp_hdr, obj, hdr, temp, t1, slow_case);
} else if (LockingMode == LM_LEGACY) {
+
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(hdr, obj);
+ lbu(hdr, Address(hdr, Klass::misc_flags_offset()));
+ test_bit(temp, hdr, exact_log2(KlassFlags::_misc_is_value_based_class));
+ bnez(temp, slow_case, /* is_far */ true);
+ }
+
Label done;
// Load object header
ld(hdr, Address(obj, hdr_offset));
diff --git a/src/hotspot/cpu/riscv/c1_Runtime1_riscv.cpp b/src/hotspot/cpu/riscv/c1_Runtime1_riscv.cpp
index 0f1f1dd891c51..849417725a70d 100644
--- a/src/hotspot/cpu/riscv/c1_Runtime1_riscv.cpp
+++ b/src/hotspot/cpu/riscv/c1_Runtime1_riscv.cpp
@@ -89,10 +89,10 @@ int StubAssembler::call_RT(Register oop_result, Register metadata_result, addres
// exception pending => remove activation and forward to exception handler
// make sure that the vm_results are cleared
if (oop_result->is_valid()) {
- sd(zr, Address(xthread, JavaThread::vm_result_offset()));
+ sd(zr, Address(xthread, JavaThread::vm_result_oop_offset()));
}
if (metadata_result->is_valid()) {
- sd(zr, Address(xthread, JavaThread::vm_result_2_offset()));
+ sd(zr, Address(xthread, JavaThread::vm_result_metadata_offset()));
}
if (frame_size() == no_frame_size) {
leave();
@@ -106,10 +106,10 @@ int StubAssembler::call_RT(Register oop_result, Register metadata_result, addres
}
// get oop results if there are any and reset the values in the thread
if (oop_result->is_valid()) {
- get_vm_result(oop_result, xthread);
+ get_vm_result_oop(oop_result, xthread);
}
if (metadata_result->is_valid()) {
- get_vm_result_2(metadata_result, xthread);
+ get_vm_result_metadata(metadata_result, xthread);
}
return call_offset;
}
@@ -427,8 +427,8 @@ OopMapSet* Runtime1::generate_handle_exception(C1StubId id, StubAssembler *sasm)
__ ld(exception_pc, Address(fp, frame::return_addr_offset * BytesPerWord));
// make sure that the vm_results are cleared (may be unnecessary)
- __ sd(zr, Address(xthread, JavaThread::vm_result_offset()));
- __ sd(zr, Address(xthread, JavaThread::vm_result_2_offset()));
+ __ sd(zr, Address(xthread, JavaThread::vm_result_oop_offset()));
+ __ sd(zr, Address(xthread, JavaThread::vm_result_metadata_offset()));
break;
case C1StubId::handle_exception_nofpu_id:
case C1StubId::handle_exception_id:
diff --git a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
index 99cbcedb8ff55..77b4e26cc924b 100644
--- a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
+++ b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
@@ -289,7 +289,7 @@ void C2_MacroAssembler::fast_lock_lightweight(Register obj, Register box,
Label slow_path;
if (UseObjectMonitorTable) {
- // Clear cache in case fast locking succeeds.
+ // Clear cache in case fast locking succeeds or we need to take the slow-path.
sd(zr, Address(box, BasicLock::object_monitor_cache_offset_in_bytes()));
}
@@ -2156,6 +2156,36 @@ void C2_MacroAssembler::enc_cmove(int cmpFlag, Register op1, Register op2, Regis
}
}
+void C2_MacroAssembler::enc_cmove_cmp_fp(int cmpFlag, FloatRegister op1, FloatRegister op2, Register dst, Register src, bool is_single) {
+ int op_select = cmpFlag & (~unsigned_branch_mask);
+
+ switch (op_select) {
+ case BoolTest::eq:
+ cmov_cmp_fp_eq(op1, op2, dst, src, is_single);
+ break;
+ case BoolTest::ne:
+ cmov_cmp_fp_ne(op1, op2, dst, src, is_single);
+ break;
+ case BoolTest::le:
+ cmov_cmp_fp_le(op1, op2, dst, src, is_single);
+ break;
+ case BoolTest::ge:
+ assert(false, "Should go to BoolTest::le case");
+ ShouldNotReachHere();
+ break;
+ case BoolTest::lt:
+ cmov_cmp_fp_lt(op1, op2, dst, src, is_single);
+ break;
+ case BoolTest::gt:
+ assert(false, "Should go to BoolTest::lt case");
+ ShouldNotReachHere();
+ break;
+ default:
+ assert(false, "unsupported compare condition");
+ ShouldNotReachHere();
+ }
+}
+
// Set dst to NaN if any NaN input.
void C2_MacroAssembler::minmax_fp(FloatRegister dst, FloatRegister src1, FloatRegister src2,
FLOAT_TYPE ft, bool is_min) {
@@ -3080,7 +3110,9 @@ void C2_MacroAssembler::compare_integral_v(VectorRegister vd, VectorRegister src
assert(is_integral_type(bt), "unsupported element type");
assert(vm == Assembler::v0_t ? vd != v0 : true, "should be different registers");
vsetvli_helper(bt, vector_length);
- vmclr_m(vd);
+ if (vm == Assembler::v0_t) {
+ vmclr_m(vd);
+ }
switch (cond) {
case BoolTest::eq: vmseq_vv(vd, src1, src2, vm); break;
case BoolTest::ne: vmsne_vv(vd, src1, src2, vm); break;
@@ -3103,7 +3135,9 @@ void C2_MacroAssembler::compare_fp_v(VectorRegister vd, VectorRegister src1, Vec
assert(is_floating_point_type(bt), "unsupported element type");
assert(vm == Assembler::v0_t ? vd != v0 : true, "should be different registers");
vsetvli_helper(bt, vector_length);
- vmclr_m(vd);
+ if (vm == Assembler::v0_t) {
+ vmclr_m(vd);
+ }
switch (cond) {
case BoolTest::eq: vmfeq_vv(vd, src1, src2, vm); break;
case BoolTest::ne: vmfne_vv(vd, src1, src2, vm); break;
diff --git a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.hpp b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.hpp
index a650174d90f08..73fceea38051e 100644
--- a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.hpp
+++ b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.hpp
@@ -129,6 +129,10 @@
Register op1, Register op2,
Register dst, Register src);
+ void enc_cmove_cmp_fp(int cmpFlag,
+ FloatRegister op1, FloatRegister op2,
+ Register dst, Register src, bool is_single);
+
void spill(Register r, bool is64, int offset) {
is64 ? sd(r, Address(sp, offset))
: sw(r, Address(sp, offset));
diff --git a/src/hotspot/cpu/riscv/c2_globals_riscv.hpp b/src/hotspot/cpu/riscv/c2_globals_riscv.hpp
index de3c1b17c8eab..79bdc4917c9ed 100644
--- a/src/hotspot/cpu/riscv/c2_globals_riscv.hpp
+++ b/src/hotspot/cpu/riscv/c2_globals_riscv.hpp
@@ -43,7 +43,7 @@ define_pd_global(bool, TieredCompilation, COMPILER1_PRESENT(true) NOT
define_pd_global(intx, CompileThreshold, 10000);
define_pd_global(intx, OnStackReplacePercentage, 140);
-define_pd_global(intx, ConditionalMoveLimit, 0);
+define_pd_global(intx, ConditionalMoveLimit, 3);
define_pd_global(intx, FreqInlineSize, 325);
define_pd_global(intx, MinJumpTableSize, 10);
define_pd_global(intx, InteriorEntryAlignment, 16);
diff --git a/src/hotspot/cpu/riscv/gc/shared/barrierSetAssembler_riscv.cpp b/src/hotspot/cpu/riscv/gc/shared/barrierSetAssembler_riscv.cpp
index d66a86c750a26..5b3c926cfa96b 100644
--- a/src/hotspot/cpu/riscv/gc/shared/barrierSetAssembler_riscv.cpp
+++ b/src/hotspot/cpu/riscv/gc/shared/barrierSetAssembler_riscv.cpp
@@ -275,7 +275,7 @@ void BarrierSetAssembler::nmethod_entry_barrier(MacroAssembler* masm, Label* slo
// order, while allowing other independent instructions to be reordered.
// Note: This may be slower than using a membar(load|load) (fence r,r).
// Because processors will not start the second load until the first comes back.
- // This means you can’t overlap the two loads,
+ // This means you can't overlap the two loads,
// which is stronger than needed for ordering (stronger than TSO).
__ srli(ra, t0, 32);
__ orr(t1, t1, ra);
diff --git a/src/hotspot/cpu/riscv/gc/shared/barrierSetNMethod_riscv.cpp b/src/hotspot/cpu/riscv/gc/shared/barrierSetNMethod_riscv.cpp
index 39da77181c674..f24e4f789bc50 100644
--- a/src/hotspot/cpu/riscv/gc/shared/barrierSetNMethod_riscv.cpp
+++ b/src/hotspot/cpu/riscv/gc/shared/barrierSetNMethod_riscv.cpp
@@ -31,8 +31,8 @@
#include "memory/resourceArea.hpp"
#include "runtime/frame.inline.hpp"
#include "runtime/javaThread.hpp"
-#include "runtime/sharedRuntime.hpp"
#include "runtime/registerMap.hpp"
+#include "runtime/sharedRuntime.hpp"
#include "utilities/align.hpp"
#include "utilities/debug.hpp"
#if INCLUDE_JVMCI
diff --git a/src/hotspot/cpu/riscv/gc/shenandoah/c1/shenandoahBarrierSetC1_riscv.cpp b/src/hotspot/cpu/riscv/gc/shenandoah/c1/shenandoahBarrierSetC1_riscv.cpp
index 2a96bd32cf8d7..11c4e5dc81b6c 100644
--- a/src/hotspot/cpu/riscv/gc/shenandoah/c1/shenandoahBarrierSetC1_riscv.cpp
+++ b/src/hotspot/cpu/riscv/gc/shenandoah/c1/shenandoahBarrierSetC1_riscv.cpp
@@ -26,9 +26,9 @@
#include "c1/c1_LIRAssembler.hpp"
#include "c1/c1_MacroAssembler.hpp"
#include "gc/shared/gc_globals.hpp"
+#include "gc/shenandoah/c1/shenandoahBarrierSetC1.hpp"
#include "gc/shenandoah/shenandoahBarrierSet.hpp"
#include "gc/shenandoah/shenandoahBarrierSetAssembler.hpp"
-#include "gc/shenandoah/c1/shenandoahBarrierSetC1.hpp"
#define __ masm->masm()->
diff --git a/src/hotspot/cpu/riscv/gc/shenandoah/shenandoahBarrierSetAssembler_riscv.cpp b/src/hotspot/cpu/riscv/gc/shenandoah/shenandoahBarrierSetAssembler_riscv.cpp
index 3021351cca84f..4c1056e75a551 100644
--- a/src/hotspot/cpu/riscv/gc/shenandoah/shenandoahBarrierSetAssembler_riscv.cpp
+++ b/src/hotspot/cpu/riscv/gc/shenandoah/shenandoahBarrierSetAssembler_riscv.cpp
@@ -23,6 +23,8 @@
*
*/
+#include "gc/shenandoah/heuristics/shenandoahHeuristics.hpp"
+#include "gc/shenandoah/mode/shenandoahMode.hpp"
#include "gc/shenandoah/shenandoahBarrierSet.hpp"
#include "gc/shenandoah/shenandoahBarrierSetAssembler.hpp"
#include "gc/shenandoah/shenandoahForwarding.hpp"
@@ -30,10 +32,8 @@
#include "gc/shenandoah/shenandoahHeapRegion.hpp"
#include "gc/shenandoah/shenandoahRuntime.hpp"
#include "gc/shenandoah/shenandoahThreadLocalData.hpp"
-#include "gc/shenandoah/heuristics/shenandoahHeuristics.hpp"
-#include "gc/shenandoah/mode/shenandoahMode.hpp"
-#include "interpreter/interpreter.hpp"
#include "interpreter/interp_masm.hpp"
+#include "interpreter/interpreter.hpp"
#include "runtime/javaThread.hpp"
#include "runtime/sharedRuntime.hpp"
#ifdef COMPILER1
diff --git a/src/hotspot/cpu/riscv/gc/z/zAddress_riscv.cpp b/src/hotspot/cpu/riscv/gc/z/zAddress_riscv.cpp
index 683d892915f50..5f783e6fb8ba5 100644
--- a/src/hotspot/cpu/riscv/gc/z/zAddress_riscv.cpp
+++ b/src/hotspot/cpu/riscv/gc/z/zAddress_riscv.cpp
@@ -22,8 +22,8 @@
* questions.
*/
-#include "gc/shared/gcLogPrecious.hpp"
#include "gc/shared/gc_globals.hpp"
+#include "gc/shared/gcLogPrecious.hpp"
#include "gc/z/zAddress.hpp"
#include "gc/z/zBarrierSetAssembler.hpp"
#include "gc/z/zGlobals.hpp"
@@ -94,7 +94,7 @@ size_t ZPlatformAddressOffsetBits() {
static const size_t valid_max_address_offset_bits = probe_valid_max_address_bit() + 1;
const size_t max_address_offset_bits = valid_max_address_offset_bits - 3;
const size_t min_address_offset_bits = max_address_offset_bits - 2;
- const size_t address_offset = round_up_power_of_2(MaxHeapSize * ZVirtualToPhysicalRatio);
+ const size_t address_offset = ZGlobalsPointers::min_address_offset_request();
const size_t address_offset_bits = log2i_exact(address_offset);
return clamp(address_offset_bits, min_address_offset_bits, max_address_offset_bits);
}
diff --git a/src/hotspot/cpu/riscv/interp_masm_riscv.cpp b/src/hotspot/cpu/riscv/interp_masm_riscv.cpp
index f1f9414d98a11..8be5408cb2b88 100644
--- a/src/hotspot/cpu/riscv/interp_masm_riscv.cpp
+++ b/src/hotspot/cpu/riscv/interp_masm_riscv.cpp
@@ -736,17 +736,18 @@ void InterpreterMacroAssembler::lock_object(Register lock_reg)
// Load object pointer into obj_reg c_rarg3
ld(obj_reg, Address(lock_reg, obj_offset));
- if (DiagnoseSyncOnValueBasedClasses != 0) {
- load_klass(tmp, obj_reg);
- lbu(tmp, Address(tmp, Klass::misc_flags_offset()));
- test_bit(tmp, tmp, exact_log2(KlassFlags::_misc_is_value_based_class));
- bnez(tmp, slow_case);
- }
-
if (LockingMode == LM_LIGHTWEIGHT) {
lightweight_lock(lock_reg, obj_reg, tmp, tmp2, tmp3, slow_case);
j(done);
} else if (LockingMode == LM_LEGACY) {
+
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(tmp, obj_reg);
+ lbu(tmp, Address(tmp, Klass::misc_flags_offset()));
+ test_bit(tmp, tmp, exact_log2(KlassFlags::_misc_is_value_based_class));
+ bnez(tmp, slow_case);
+ }
+
// Load (object->mark() | 1) into swap_reg
ld(t0, Address(obj_reg, oopDesc::mark_offset_in_bytes()));
ori(swap_reg, t0, 1);
diff --git a/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp b/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp
index 1916fbdeb18b3..825722ad29bc0 100644
--- a/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp
+++ b/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp
@@ -183,7 +183,6 @@ void MacroAssembler::set_membar_kind(address addr, uint32_t order_kind) {
Assembler::sd_instr(membar, insn);
}
-
static void pass_arg0(MacroAssembler* masm, Register arg) {
if (c_rarg0 != arg) {
masm->mv(c_rarg0, arg);
@@ -499,19 +498,19 @@ void MacroAssembler::call_VM_base(Register oop_result,
// get oop result if there is one and reset the value in the thread
if (oop_result->is_valid()) {
- get_vm_result(oop_result, java_thread);
+ get_vm_result_oop(oop_result, java_thread);
}
}
-void MacroAssembler::get_vm_result(Register oop_result, Register java_thread) {
- ld(oop_result, Address(java_thread, JavaThread::vm_result_offset()));
- sd(zr, Address(java_thread, JavaThread::vm_result_offset()));
+void MacroAssembler::get_vm_result_oop(Register oop_result, Register java_thread) {
+ ld(oop_result, Address(java_thread, JavaThread::vm_result_oop_offset()));
+ sd(zr, Address(java_thread, JavaThread::vm_result_oop_offset()));
verify_oop_msg(oop_result, "broken oop in call_VM_base");
}
-void MacroAssembler::get_vm_result_2(Register metadata_result, Register java_thread) {
- ld(metadata_result, Address(java_thread, JavaThread::vm_result_2_offset()));
- sd(zr, Address(java_thread, JavaThread::vm_result_2_offset()));
+void MacroAssembler::get_vm_result_metadata(Register metadata_result, Register java_thread) {
+ ld(metadata_result, Address(java_thread, JavaThread::vm_result_metadata_offset()));
+ sd(zr, Address(java_thread, JavaThread::vm_result_metadata_offset()));
}
void MacroAssembler::clinit_barrier(Register klass, Register tmp, Label* L_fast_path, Label* L_slow_path) {
@@ -1268,6 +1267,130 @@ void MacroAssembler::cmov_gtu(Register cmp1, Register cmp2, Register dst, Regist
bind(no_set);
}
+// ----------- cmove, compare float -----------
+
+// Move src to dst only if cmp1 == cmp2,
+// otherwise leave dst unchanged, including the case where one of them is NaN.
+// Clarification:
+// java code : cmp1 != cmp2 ? dst : src
+// transformed to : CMove dst, (cmp1 eq cmp2), dst, src
+void MacroAssembler::cmov_cmp_fp_eq(FloatRegister cmp1, FloatRegister cmp2, Register dst, Register src, bool is_single) {
+ if (UseZicond) {
+ if (is_single) {
+ feq_s(t0, cmp1, cmp2);
+ } else {
+ feq_d(t0, cmp1, cmp2);
+ }
+ czero_nez(dst, dst, t0);
+ czero_eqz(t0 , src, t0);
+ orr(dst, dst, t0);
+ return;
+ }
+ Label no_set;
+ if (is_single) {
+ // jump if cmp1 != cmp2, including the case of NaN
+ // not jump (i.e. move src to dst) if cmp1 == cmp2
+ float_bne(cmp1, cmp2, no_set);
+ } else {
+ double_bne(cmp1, cmp2, no_set);
+ }
+ mv(dst, src);
+ bind(no_set);
+}
+
+// Keep dst unchanged only if cmp1 == cmp2,
+// otherwise move src to dst, including the case where one of them is NaN.
+// Clarification:
+// java code : cmp1 == cmp2 ? dst : src
+// transformed to : CMove dst, (cmp1 ne cmp2), dst, src
+void MacroAssembler::cmov_cmp_fp_ne(FloatRegister cmp1, FloatRegister cmp2, Register dst, Register src, bool is_single) {
+ if (UseZicond) {
+ if (is_single) {
+ feq_s(t0, cmp1, cmp2);
+ } else {
+ feq_d(t0, cmp1, cmp2);
+ }
+ czero_eqz(dst, dst, t0);
+ czero_nez(t0 , src, t0);
+ orr(dst, dst, t0);
+ return;
+ }
+ Label no_set;
+ if (is_single) {
+ // jump if cmp1 == cmp2
+ // not jump (i.e. move src to dst) if cmp1 != cmp2, including the case of NaN
+ float_beq(cmp1, cmp2, no_set);
+ } else {
+ double_beq(cmp1, cmp2, no_set);
+ }
+ mv(dst, src);
+ bind(no_set);
+}
+
+// When cmp1 <= cmp2 or any of them is NaN then dst = src, otherwise, dst = dst
+// Clarification
+// scenario 1:
+// java code : cmp2 < cmp1 ? dst : src
+// transformed to : CMove dst, (cmp1 le cmp2), dst, src
+// scenario 2:
+// java code : cmp1 > cmp2 ? dst : src
+// transformed to : CMove dst, (cmp1 le cmp2), dst, src
+void MacroAssembler::cmov_cmp_fp_le(FloatRegister cmp1, FloatRegister cmp2, Register dst, Register src, bool is_single) {
+ if (UseZicond) {
+ if (is_single) {
+ flt_s(t0, cmp2, cmp1);
+ } else {
+ flt_d(t0, cmp2, cmp1);
+ }
+ czero_eqz(dst, dst, t0);
+ czero_nez(t0 , src, t0);
+ orr(dst, dst, t0);
+ return;
+ }
+ Label no_set;
+ if (is_single) {
+ // jump if cmp1 > cmp2
+ // not jump (i.e. move src to dst) if cmp1 <= cmp2 or either is NaN
+ float_bgt(cmp1, cmp2, no_set);
+ } else {
+ double_bgt(cmp1, cmp2, no_set);
+ }
+ mv(dst, src);
+ bind(no_set);
+}
+
+// When cmp1 < cmp2 or any of them is NaN then dst = src, otherwise, dst = dst
+// Clarification
+// scenario 1:
+// java code : cmp2 <= cmp1 ? dst : src
+// transformed to : CMove dst, (cmp1 lt cmp2), dst, src
+// scenario 2:
+// java code : cmp1 >= cmp2 ? dst : src
+// transformed to : CMove dst, (cmp1 lt cmp2), dst, src
+void MacroAssembler::cmov_cmp_fp_lt(FloatRegister cmp1, FloatRegister cmp2, Register dst, Register src, bool is_single) {
+ if (UseZicond) {
+ if (is_single) {
+ fle_s(t0, cmp2, cmp1);
+ } else {
+ fle_d(t0, cmp2, cmp1);
+ }
+ czero_eqz(dst, dst, t0);
+ czero_nez(t0 , src, t0);
+ orr(dst, dst, t0);
+ return;
+ }
+ Label no_set;
+ if (is_single) {
+ // jump if cmp1 >= cmp2
+ // not jump (i.e. move src to dst) if cmp1 < cmp2 or either is NaN
+ float_bge(cmp1, cmp2, no_set);
+ } else {
+ double_bge(cmp1, cmp2, no_set);
+ }
+ mv(dst, src);
+ bind(no_set);
+}
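The NaN behaviour documented for these helpers can be summarised in plain C++. The
sketch below is illustrative only (double is shown for the non-single case): eq/ne treat
an unordered comparison as "not equal", while le/lt move src whenever the comparison is
true or unordered.

    // dst = src only when cmp1 == cmp2; NaN never triggers the move.
    static double cmove_eq(double cmp1, double cmp2, double dst, double src) {
      return (cmp1 == cmp2) ? src : dst;
    }

    // dst = src when cmp1 <= cmp2 or either input is NaN (unordered compares
    // as "not greater"); otherwise dst is left unchanged.
    static double cmove_le(double cmp1, double cmp2, double dst, double src) {
      return !(cmp1 > cmp2) ? src : dst;
    }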
+
// Float compare branch instructions
#define INSN(NAME, FLOATCMP, BRANCH) \
@@ -1683,7 +1806,7 @@ void MacroAssembler::vector_update_crc32(Register crc, Register buf, Register le
for (int i = 0; i < N; i++) {
vmv_x_s(tmp2, vcrc);
// in vmv_x_s, the value is sign-extended to SEW bits, but we need zero-extended here.
- zext_w(tmp2, tmp2);
+ zext(tmp2, tmp2, 32);
vslidedown_vi(vcrc, vcrc, 1);
xorr(crc, crc, tmp2);
for (int j = 0; j < W; j++) {
@@ -3556,6 +3679,14 @@ void MacroAssembler::lookup_virtual_method(Register recv_klass,
}
void MacroAssembler::membar(uint32_t order_constraint) {
+ if (UseZtso && ((order_constraint & StoreLoad) != StoreLoad)) {
+    // Ztso only allows older stores to be reordered with later loads. Any
+    // ordering constraint that does not include StoreLoad is therefore
+    // already guaranteed by the hardware, so the fence can be elided.
+ BLOCK_COMMENT("elided tso membar");
+ return;
+ }
+
address prev = pc() - MacroAssembler::instruction_size;
address last = code()->last_insn();
@@ -3564,15 +3695,14 @@ void MacroAssembler::membar(uint32_t order_constraint) {
// can do this simply by ORing them together.
set_membar_kind(prev, get_membar_kind(prev) | order_constraint);
BLOCK_COMMENT("merged membar");
- } else {
- code()->set_last_insn(pc());
-
- uint32_t predecessor = 0;
- uint32_t successor = 0;
-
- membar_mask_to_pred_succ(order_constraint, predecessor, successor);
- fence(predecessor, successor);
+ return;
}
+
+ code()->set_last_insn(pc());
+ uint32_t predecessor = 0;
+ uint32_t successor = 0;
+ membar_mask_to_pred_succ(order_constraint, predecessor, successor);
+ fence(predecessor, successor);
}
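The new early return in membar() relies on Ztso guaranteeing every ordering except
older-store/younger-load. A one-line C++ model of the decision (illustrative, not part
of the patch):

    #include <stdint.h>

    // Under Ztso only barriers that include StoreLoad need an actual fence;
    // all other orderings are already provided by the hardware memory model.
    static bool needs_fence(bool use_ztso, uint32_t order_constraint, uint32_t store_load_mask) {
      return !use_ztso || (order_constraint & store_load_mask) == store_load_mask;
    }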
void MacroAssembler::cmodx_fence() {
@@ -6356,10 +6486,17 @@ void MacroAssembler::lightweight_lock(Register basic_lock, Register obj, Registe
ld(mark, Address(obj, oopDesc::mark_offset_in_bytes()));
if (UseObjectMonitorTable) {
- // Clear cache in case fast locking succeeds.
+ // Clear cache in case fast locking succeeds or we need to take the slow-path.
sd(zr, Address(basic_lock, BasicObjectLock::lock_offset() + in_ByteSize((BasicLock::object_monitor_cache_offset_in_bytes()))));
}
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(tmp1, obj);
+ lbu(tmp1, Address(tmp1, Klass::misc_flags_offset()));
+ test_bit(tmp1, tmp1, exact_log2(KlassFlags::_misc_is_value_based_class));
+ bnez(tmp1, slow, /* is_far */ true);
+ }
+
// Check if the lock-stack is full.
lwu(top, Address(xthread, JavaThread::lock_stack_top_offset()));
mv(t, (unsigned)LockStack::end_offset());
diff --git a/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp b/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp
index 5d36a5f6fcd91..c47200579c785 100644
--- a/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp
+++ b/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp
@@ -122,8 +122,8 @@ class MacroAssembler: public Assembler {
Register arg_1, Register arg_2, Register arg_3,
bool check_exceptions = true);
- void get_vm_result(Register oop_result, Register java_thread);
- void get_vm_result_2(Register metadata_result, Register java_thread);
+ void get_vm_result_oop(Register oop_result, Register java_thread);
+ void get_vm_result_metadata(Register metadata_result, Register java_thread);
// These always tightly bind to MacroAssembler::call_VM_leaf_base
// bypassing the virtual implementation
@@ -417,15 +417,17 @@ class MacroAssembler: public Assembler {
   // We use four bits to indicate the read and write bits in the predecessors and successors,
   // and extend i for r, o for w if UseConservativeFence is enabled.
enum Membar_mask_bits {
- StoreStore = 0b0101, // (pred = ow + succ = ow)
- LoadStore = 0b1001, // (pred = ir + succ = ow)
- StoreLoad = 0b0110, // (pred = ow + succ = ir)
- LoadLoad = 0b1010, // (pred = ir + succ = ir)
- AnyAny = LoadStore | StoreLoad // (pred = iorw + succ = iorw)
+ StoreStore = 0b0101, // (pred = w + succ = w)
+ LoadStore = 0b1001, // (pred = r + succ = w)
+ StoreLoad = 0b0110, // (pred = w + succ = r)
+ LoadLoad = 0b1010, // (pred = r + succ = r)
+ AnyAny = LoadStore | StoreLoad // (pred = rw + succ = rw)
};
void membar(uint32_t order_constraint);
+ private:
+
static void membar_mask_to_pred_succ(uint32_t order_constraint,
uint32_t& predecessor, uint32_t& successor) {
predecessor = (order_constraint >> 2) & 0x3;
@@ -437,7 +439,7 @@ class MacroAssembler: public Assembler {
// 11(rw)-> 1111(iorw)
if (UseConservativeFence) {
predecessor |= predecessor << 2;
- successor |= successor << 2;
+ successor |= successor << 2;
}
}
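As a worked example of the 4-bit encoding above (assuming the successor occupies the low
two bits, mirroring membar_mask_to_pred_succ): StoreLoad = 0b0110 splits into predecessor
0b01 (w) and successor 0b10 (r), i.e. "fence w, r"; UseConservativeFence then widens this
to "fence ow, ir".

    #include <stdint.h>

    // Split a Membar_mask_bits value into fence predecessor/successor bits.
    static void split_mask(uint32_t mask, uint32_t &pred, uint32_t &succ) {
      pred = (mask >> 2) & 0x3;  // StoreLoad (0b0110) -> 0b01 (w)
      succ = mask & 0x3;         //                    -> 0b10 (r)
    }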
@@ -445,25 +447,13 @@ class MacroAssembler: public Assembler {
return ((predecessor & 0x3) << 2) | (successor & 0x3);
}
- void fence(uint32_t predecessor, uint32_t successor) {
- if (UseZtso) {
- if ((pred_succ_to_membar_mask(predecessor, successor) & StoreLoad) == StoreLoad) {
- // TSO allows for stores to be reordered after loads. When the compiler
- // generates a fence to disallow that, we are required to generate the
- // fence for correctness.
- Assembler::fence(predecessor, successor);
- } else {
- // TSO guarantees other fences already.
- }
- } else {
- // always generate fence for RVWMO
- Assembler::fence(predecessor, successor);
- }
- }
+ public:
void cmodx_fence();
void pause() {
+ // Zihintpause
+ // PAUSE is encoded as a FENCE instruction with pred=W, succ=0, fm=0, rd=x0, and rs1=x0.
Assembler::fence(w, 0);
}
@@ -667,6 +657,11 @@ class MacroAssembler: public Assembler {
void cmov_gt(Register cmp1, Register cmp2, Register dst, Register src);
void cmov_gtu(Register cmp1, Register cmp2, Register dst, Register src);
+ void cmov_cmp_fp_eq(FloatRegister cmp1, FloatRegister cmp2, Register dst, Register src, bool is_single);
+ void cmov_cmp_fp_ne(FloatRegister cmp1, FloatRegister cmp2, Register dst, Register src, bool is_single);
+ void cmov_cmp_fp_le(FloatRegister cmp1, FloatRegister cmp2, Register dst, Register src, bool is_single);
+ void cmov_cmp_fp_lt(FloatRegister cmp1, FloatRegister cmp2, Register dst, Register src, bool is_single);
+
public:
   // We try to follow risc-v asm mnemonics.
   // But as we don't lay out a reachable GOT,
@@ -680,9 +675,9 @@ class MacroAssembler: public Assembler {
// JALR, return address stack updates:
// | rd is x1/x5 | rs1 is x1/x5 | rd=rs1 | RAS action
// | ----------- | ------------ | ------ |-------------
- // | No | No | — | None
- // | No | Yes | — | Pop
- // | Yes | No | — | Push
+ // | No | No | - | None
+ // | No | Yes | - | Pop
+ // | Yes | No | - | Push
// | Yes | Yes | No | Pop, then push
// | Yes | Yes | Yes | Push
//
diff --git a/src/hotspot/cpu/riscv/nativeInst_riscv.hpp b/src/hotspot/cpu/riscv/nativeInst_riscv.hpp
index 295e92bbc1b78..d8f5fa57816f3 100644
--- a/src/hotspot/cpu/riscv/nativeInst_riscv.hpp
+++ b/src/hotspot/cpu/riscv/nativeInst_riscv.hpp
@@ -300,7 +300,7 @@ class NativeGeneralJump: public NativeJump {
inline NativeGeneralJump* nativeGeneralJump_at(address addr) {
assert_cond(addr != nullptr);
NativeGeneralJump* jump = (NativeGeneralJump*)(addr);
- debug_only(jump->verify();)
+ DEBUG_ONLY(jump->verify();)
return jump;
}
diff --git a/src/hotspot/cpu/riscv/riscv.ad b/src/hotspot/cpu/riscv/riscv.ad
index 59171d84c9b80..f6fb2e195d3d8 100644
--- a/src/hotspot/cpu/riscv/riscv.ad
+++ b/src/hotspot/cpu/riscv/riscv.ad
@@ -1596,7 +1596,8 @@ uint MachSpillCopyNode::implementation(C2_MacroAssembler *masm, PhaseRegAlloc *r
__ unspill(as_VectorRegister(Matcher::_regEncode[dst_lo]), ra_->reg2offset(src_lo));
} else if (src_lo_rc == rc_vector && dst_lo_rc == rc_vector) {
// vpr to vpr
- __ vmv1r_v(as_VectorRegister(Matcher::_regEncode[dst_lo]), as_VectorRegister(Matcher::_regEncode[src_lo]));
+ __ vsetvli_helper(T_BYTE, MaxVectorSize);
+ __ vmv_v_v(as_VectorRegister(Matcher::_regEncode[dst_lo]), as_VectorRegister(Matcher::_regEncode[src_lo]));
} else {
ShouldNotReachHere();
}
@@ -1614,7 +1615,8 @@ uint MachSpillCopyNode::implementation(C2_MacroAssembler *masm, PhaseRegAlloc *r
__ unspill_vmask(as_VectorRegister(Matcher::_regEncode[dst_lo]), ra_->reg2offset(src_lo));
} else if (src_lo_rc == rc_vector && dst_lo_rc == rc_vector) {
// vmask to vmask
- __ vmv1r_v(as_VectorRegister(Matcher::_regEncode[dst_lo]), as_VectorRegister(Matcher::_regEncode[src_lo]));
+ __ vsetvli_helper(T_BYTE, MaxVectorSize >> 3);
+ __ vmv_v_v(as_VectorRegister(Matcher::_regEncode[dst_lo]), as_VectorRegister(Matcher::_regEncode[src_lo]));
} else {
ShouldNotReachHere();
}
@@ -1914,9 +1916,10 @@ bool Matcher::match_rule_supported(int opcode) {
case Op_FmaF:
case Op_FmaD:
+ return UseFMA;
case Op_FmaVF:
case Op_FmaVD:
- return UseFMA;
+ return UseRVV && UseFMA;
case Op_ConvHF2F:
case Op_ConvF2HF:
@@ -1933,6 +1936,12 @@ bool Matcher::match_rule_supported(int opcode) {
case Op_SubHF:
case Op_SqrtHF:
return UseZfh;
+
+ case Op_CMoveF:
+ case Op_CMoveD:
+ case Op_CMoveP:
+ case Op_CMoveN:
+ return false;
}
return true; // Per default match rules are supported.
@@ -1944,11 +1953,11 @@ const RegMask* Matcher::predicate_reg_mask(void) {
// Vector calling convention not yet implemented.
bool Matcher::supports_vector_calling_convention(void) {
- return EnableVectorSupport && UseVectorStubs;
+ return EnableVectorSupport;
}
OptoRegPair Matcher::vector_return_value(uint ideal_reg) {
- assert(EnableVectorSupport && UseVectorStubs, "sanity");
+ assert(EnableVectorSupport, "sanity");
assert(ideal_reg == Op_VecA, "sanity");
// check more info at https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc
int lo = V8_num;
@@ -4397,6 +4406,12 @@ pipe_class pipe_slow()
LDST : MEM;
%}
+// The real do-nothing guy
+pipe_class real_empty()
+%{
+ instruction_count(0);
+%}
+
// Empty pipeline class
pipe_class pipe_class_empty()
%{
@@ -6439,7 +6454,6 @@ instruct addI_reg_imm(iRegINoSp dst, iRegIorL2I src1, immIAdd src2) %{
format %{ "addiw $dst, $src1, $src2\t#@addI_reg_imm" %}
ins_encode %{
- int32_t con = (int32_t)$src2$$constant;
__ addiw(as_Register($dst$$reg),
as_Register($src1$$reg),
$src2$$constant);
@@ -6501,7 +6515,6 @@ instruct addP_reg_imm(iRegPNoSp dst, iRegP src1, immLAdd src2) %{
format %{ "addi $dst, $src1, $src2\t# ptr, #@addP_reg_imm" %}
ins_encode %{
- // src2 is imm, so actually call the addi
__ addi(as_Register($dst$$reg),
as_Register($src1$$reg),
$src2$$constant);
@@ -6823,7 +6836,7 @@ instruct UmodL(iRegLNoSp dst, iRegL src1, iRegL src2) %{
// Integer Shifts
// Shift Left Register
-// In RV64I, only the low 5 bits of src2 are considered for the shift amount
+// Only the low 5 bits of src2 are considered for the shift amount, all other bits are ignored.
instruct lShiftI_reg_reg(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I src2) %{
match(Set dst (LShiftI src1 src2));
ins_cost(ALU_COST);
@@ -6856,7 +6869,7 @@ instruct lShiftI_reg_imm(iRegINoSp dst, iRegIorL2I src1, immI src2) %{
%}
// Shift Right Logical Register
-// In RV64I, only the low 5 bits of src2 are considered for the shift amount
+// Only the low 5 bits of src2 are considered for the shift amount, all other bits are ignored.
instruct urShiftI_reg_reg(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I src2) %{
match(Set dst (URShiftI src1 src2));
ins_cost(ALU_COST);
@@ -6889,7 +6902,7 @@ instruct urShiftI_reg_imm(iRegINoSp dst, iRegIorL2I src1, immI src2) %{
%}
// Shift Right Arithmetic Register
-// In RV64I, only the low 5 bits of src2 are considered for the shift amount
+// Only the low 5 bits of src2 are considered for the shift amount, all other bits are ignored.
instruct rShiftI_reg_reg(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I src2) %{
match(Set dst (RShiftI src1 src2));
ins_cost(ALU_COST);
@@ -6924,7 +6937,7 @@ instruct rShiftI_reg_imm(iRegINoSp dst, iRegIorL2I src1, immI src2) %{
// Long Shifts
// Shift Left Register
-// In RV64I, only the low 6 bits of src2 are considered for the shift amount
+// Only the low 6 bits of src2 are considered for the shift amount, all other bits are ignored.
instruct lShiftL_reg_reg(iRegLNoSp dst, iRegL src1, iRegIorL2I src2) %{
match(Set dst (LShiftL src1 src2));
@@ -6959,7 +6972,7 @@ instruct lShiftL_reg_imm(iRegLNoSp dst, iRegL src1, immI src2) %{
%}
// Shift Right Logical Register
-// In RV64I, only the low 6 bits of src2 are considered for the shift amount
+// Only the low 6 bits of src2 are considered for the shift amount, all other bits are ignored.
instruct urShiftL_reg_reg(iRegLNoSp dst, iRegL src1, iRegIorL2I src2) %{
match(Set dst (URShiftL src1 src2));
@@ -7012,7 +7025,7 @@ instruct urShiftP_reg_imm(iRegLNoSp dst, iRegP src1, immI src2) %{
%}
// Shift Right Arithmetic Register
-// In RV64I, only the low 6 bits of src2 are considered for the shift amount
+// Only the low 6 bits of src2 are considered for the shift amount, all other bits are ignored.
instruct rShiftL_reg_reg(iRegLNoSp dst, iRegL src1, iRegIorL2I src2) %{
match(Set dst (RShiftL src1 src2));
@@ -7902,78 +7915,102 @@ instruct xorL_reg_imm(iRegLNoSp dst, iRegL src1, immLAdd src2) %{
// ============================================================================
// MemBar Instruction
-instruct load_fence() %{
+// RVTSO
+
+instruct unnecessary_membar_rvtso() %{
+ predicate(UseZtso);
match(LoadFence);
- ins_cost(ALU_COST);
+ match(StoreFence);
+ match(StoreStoreFence);
+ match(MemBarAcquire);
+ match(MemBarRelease);
+ match(MemBarStoreStore);
+ match(MemBarAcquireLock);
+ match(MemBarReleaseLock);
- format %{ "#@load_fence" %}
+ ins_cost(0);
+ size(0);
+
+ format %{ "#@unnecessary_membar_rvtso elided/tso (empty encoding)" %}
ins_encode %{
- __ membar(MacroAssembler::LoadLoad | MacroAssembler::LoadStore);
+ __ block_comment("unnecessary_membar_rvtso");
%}
- ins_pipe(pipe_serial);
+ ins_pipe(real_empty);
%}
-instruct membar_acquire() %{
- match(MemBarAcquire);
- ins_cost(ALU_COST);
+instruct membar_volatile_rvtso() %{
+ predicate(UseZtso);
+ match(MemBarVolatile);
+ ins_cost(VOLATILE_REF_COST);
- format %{ "#@membar_acquire\n\t"
- "fence ir iorw" %}
+ format %{ "#@membar_volatile_rvtso\n\t"
+ "fence w, r"%}
ins_encode %{
- __ block_comment("membar_acquire");
- __ membar(MacroAssembler::LoadLoad | MacroAssembler::LoadStore);
+ __ block_comment("membar_volatile_rvtso");
+ __ membar(MacroAssembler::StoreLoad);
%}
- ins_pipe(pipe_serial);
+ ins_pipe(pipe_slow);
%}
-instruct membar_acquire_lock() %{
- match(MemBarAcquireLock);
+instruct unnecessary_membar_volatile_rvtso() %{
+ predicate(UseZtso && Matcher::post_store_load_barrier(n));
+ match(MemBarVolatile);
ins_cost(0);
- format %{ "#@membar_acquire_lock (elided)" %}
-
+ size(0);
+
+ format %{ "#@unnecessary_membar_volatile_rvtso (unnecessary so empty encoding)" %}
ins_encode %{
- __ block_comment("membar_acquire_lock (elided)");
+ __ block_comment("unnecessary_membar_volatile_rvtso");
%}
-
- ins_pipe(pipe_serial);
+ ins_pipe(real_empty);
%}
-instruct store_fence() %{
- match(StoreFence);
- ins_cost(ALU_COST);
+// RVWMO
- format %{ "#@store_fence" %}
+instruct membar_acquire_rvwmo() %{
+ predicate(!UseZtso);
+ match(LoadFence);
+ match(MemBarAcquire);
+ ins_cost(VOLATILE_REF_COST);
+
+  format %{ "#@membar_acquire_rvwmo\n\t"
+ "fence r, rw" %}
ins_encode %{
- __ membar(MacroAssembler::LoadStore | MacroAssembler::StoreStore);
+    __ block_comment("membar_acquire_rvwmo");
+ __ membar(MacroAssembler::LoadLoad | MacroAssembler::LoadStore);
%}
ins_pipe(pipe_serial);
%}
-instruct membar_release() %{
+instruct membar_release_rvwmo() %{
+ predicate(!UseZtso);
+ match(StoreFence);
match(MemBarRelease);
- ins_cost(ALU_COST);
+ ins_cost(VOLATILE_REF_COST);
- format %{ "#@membar_release\n\t"
- "fence iorw ow" %}
+ format %{ "#@membar_release_rvwmo\n\t"
+ "fence rw, w" %}
ins_encode %{
- __ block_comment("membar_release");
+ __ block_comment("membar_release_rvwmo");
__ membar(MacroAssembler::LoadStore | MacroAssembler::StoreStore);
%}
ins_pipe(pipe_serial);
%}
-instruct membar_storestore() %{
+instruct membar_storestore_rvwmo() %{
+ predicate(!UseZtso);
match(MemBarStoreStore);
match(StoreStoreFence);
- ins_cost(ALU_COST);
+ ins_cost(VOLATILE_REF_COST);
- format %{ "MEMBAR-store-store\t#@membar_storestore" %}
+ format %{ "#@membar_storestore_rvwmo\n\t"
+ "fence w, w" %}
ins_encode %{
__ membar(MacroAssembler::StoreStore);
@@ -7981,34 +8018,50 @@ instruct membar_storestore() %{
ins_pipe(pipe_serial);
%}
-instruct membar_release_lock() %{
- match(MemBarReleaseLock);
- ins_cost(0);
+instruct membar_volatile_rvwmo() %{
+ predicate(!UseZtso);
+ match(MemBarVolatile);
+ ins_cost(VOLATILE_REF_COST);
- format %{ "#@membar_release_lock (elided)" %}
+ format %{ "#@membar_volatile_rvwmo\n\t"
+ "fence w, r"%}
ins_encode %{
- __ block_comment("membar_release_lock (elided)");
+ __ block_comment("membar_volatile_rvwmo");
+ __ membar(MacroAssembler::StoreLoad);
%}
ins_pipe(pipe_serial);
%}
-instruct membar_volatile() %{
- match(MemBarVolatile);
- ins_cost(ALU_COST);
+instruct membar_lock_rvwmo() %{
+ predicate(!UseZtso);
+ match(MemBarAcquireLock);
+ match(MemBarReleaseLock);
+ ins_cost(0);
- format %{ "#@membar_volatile\n\t"
- "fence iorw iorw"%}
+ format %{ "#@membar_lock_rvwmo (elided)" %}
ins_encode %{
- __ block_comment("membar_volatile");
- __ membar(MacroAssembler::StoreLoad);
+ __ block_comment("membar_lock_rvwmo (elided)");
%}
ins_pipe(pipe_serial);
%}
+instruct unnecessary_membar_volatile_rvwmo() %{
+ predicate(!UseZtso && Matcher::post_store_load_barrier(n));
+ match(MemBarVolatile);
+ ins_cost(0);
+
+ size(0);
+ format %{ "#@unnecessary_membar_volatile_rvwmo (unnecessary so empty encoding)" %}
+ ins_encode %{
+ __ block_comment("unnecessary_membar_volatile_rvwmo");
+ %}
+ ins_pipe(real_empty);
+%}
+
instruct spin_wait() %{
predicate(UseZihintpause);
match(OnSpinWait);
@@ -9894,12 +9947,15 @@ instruct far_cmpP_narrowOop_imm0_branch(cmpOpEqNe cmp, iRegN op1, immP0 zero, la
// ============================================================================
// Conditional Move Instructions
+
+// --------- CMoveI ---------
+
instruct cmovI_cmpI(iRegINoSp dst, iRegI src, iRegI op1, iRegI op2, cmpOp cop) %{
match(Set dst (CMoveI (Binary cop (CmpI op1 op2)) (Binary dst src)));
ins_cost(ALU_COST + BRANCH_COST);
format %{
- "CMove $dst, ($op1 $cop $op2), $dst, $src\t#@cmovI_cmpI\n\t"
+ "CMoveI $dst, ($op1 $cop $op2), $dst, $src\t#@cmovI_cmpI\n\t"
%}
ins_encode %{
@@ -9916,7 +9972,7 @@ instruct cmovI_cmpU(iRegINoSp dst, iRegI src, iRegI op1, iRegI op2, cmpOpU cop)
ins_cost(ALU_COST + BRANCH_COST);
format %{
- "CMove $dst, ($op1 $cop $op2), $dst, $src\t#@cmovI_cmpU\n\t"
+ "CMoveI $dst, ($op1 $cop $op2), $dst, $src\t#@cmovI_cmpU\n\t"
%}
ins_encode %{
@@ -9933,7 +9989,7 @@ instruct cmovI_cmpL(iRegINoSp dst, iRegI src, iRegL op1, iRegL op2, cmpOp cop) %
ins_cost(ALU_COST + BRANCH_COST);
format %{
- "CMove $dst, ($op1 $cop $op2), $dst, $src\t#@cmovI_cmpL\n\t"
+ "CMoveI $dst, ($op1 $cop $op2), $dst, $src\t#@cmovI_cmpL\n\t"
%}
ins_encode %{
@@ -9950,7 +10006,7 @@ instruct cmovI_cmpUL(iRegINoSp dst, iRegI src, iRegL op1, iRegL op2, cmpOpU cop)
ins_cost(ALU_COST + BRANCH_COST);
format %{
- "CMove $dst, ($op1 $cop $op2), $dst, $src\t#@cmovI_cmpUL\n\t"
+ "CMoveI $dst, ($op1 $cop $op2), $dst, $src\t#@cmovI_cmpUL\n\t"
%}
ins_encode %{
@@ -9962,12 +10018,46 @@ instruct cmovI_cmpUL(iRegINoSp dst, iRegI src, iRegL op1, iRegL op2, cmpOpU cop)
ins_pipe(pipe_class_compare);
%}
+instruct cmovI_cmpF(iRegINoSp dst, iRegI src, fRegF op1, fRegF op2, cmpOp cop) %{
+ match(Set dst (CMoveI (Binary cop (CmpF op1 op2)) (Binary dst src)));
+ ins_cost(ALU_COST + BRANCH_COST);
+
+ format %{
+ "CMoveI $dst, ($op1 $cop $op2), $dst, $src\t#@cmovI_cmpF\n\t"
+ %}
+
+ ins_encode %{
+ __ enc_cmove_cmp_fp($cop$$cmpcode,
+ as_FloatRegister($op1$$reg), as_FloatRegister($op2$$reg),
+ as_Register($dst$$reg), as_Register($src$$reg), true /* is_single */);
+ %}
+
+ ins_pipe(pipe_class_compare);
+%}
+
+instruct cmovI_cmpD(iRegINoSp dst, iRegI src, fRegD op1, fRegD op2, cmpOp cop) %{
+ match(Set dst (CMoveI (Binary cop (CmpD op1 op2)) (Binary dst src)));
+ ins_cost(ALU_COST + BRANCH_COST);
+
+ format %{
+ "CMoveI $dst, ($op1 $cop $op2), $dst, $src\t#@cmovI_cmpD\n\t"
+ %}
+
+ ins_encode %{
+ __ enc_cmove_cmp_fp($cop$$cmpcode | C2_MacroAssembler::double_branch_mask,
+ as_FloatRegister($op1$$reg), as_FloatRegister($op2$$reg),
+ as_Register($dst$$reg), as_Register($src$$reg), false /* is_single */);
+ %}
+
+ ins_pipe(pipe_class_compare);
+%}
+
instruct cmovI_cmpN(iRegINoSp dst, iRegI src, iRegN op1, iRegN op2, cmpOpU cop) %{
match(Set dst (CMoveI (Binary cop (CmpN op1 op2)) (Binary dst src)));
ins_cost(ALU_COST + BRANCH_COST);
format %{
- "CMove $dst, ($op1 $cop $op2), $dst, $src\t#@cmovI_cmpN\n\t"
+ "CMoveI $dst, ($op1 $cop $op2), $dst, $src\t#@cmovI_cmpN\n\t"
%}
ins_encode %{
@@ -9984,7 +10074,7 @@ instruct cmovI_cmpP(iRegINoSp dst, iRegI src, iRegP op1, iRegP op2, cmpOpU cop)
ins_cost(ALU_COST + BRANCH_COST);
format %{
- "CMove $dst, ($op1 $cop $op2), $dst, $src\t#@cmovI_cmpP\n\t"
+ "CMoveI $dst, ($op1 $cop $op2), $dst, $src\t#@cmovI_cmpP\n\t"
%}
ins_encode %{
@@ -9996,12 +10086,14 @@ instruct cmovI_cmpP(iRegINoSp dst, iRegI src, iRegP op1, iRegP op2, cmpOpU cop)
ins_pipe(pipe_class_compare);
%}
+// --------- CMoveL ---------
+
instruct cmovL_cmpL(iRegLNoSp dst, iRegL src, iRegL op1, iRegL op2, cmpOp cop) %{
match(Set dst (CMoveL (Binary cop (CmpL op1 op2)) (Binary dst src)));
ins_cost(ALU_COST + BRANCH_COST);
format %{
- "CMove $dst, ($op1 $cop $op2), $dst, $src\t#@cmovL_cmpL\n\t"
+ "CMoveL $dst, ($op1 $cop $op2), $dst, $src\t#@cmovL_cmpL\n\t"
%}
ins_encode %{
@@ -10018,7 +10110,7 @@ instruct cmovL_cmpUL(iRegLNoSp dst, iRegL src, iRegL op1, iRegL op2, cmpOpU cop)
ins_cost(ALU_COST + BRANCH_COST);
format %{
- "CMove $dst, ($op1 $cop $op2), $dst, $src\t#@cmovL_cmpUL\n\t"
+ "CMoveL $dst, ($op1 $cop $op2), $dst, $src\t#@cmovL_cmpUL\n\t"
%}
ins_encode %{
@@ -10035,7 +10127,7 @@ instruct cmovL_cmpI(iRegLNoSp dst, iRegL src, iRegI op1, iRegI op2, cmpOp cop) %
ins_cost(ALU_COST + BRANCH_COST);
format %{
- "CMove $dst, ($op1 $cop $op2), $dst, $src\t#@cmovL_cmpI\n\t"
+ "CMoveL $dst, ($op1 $cop $op2), $dst, $src\t#@cmovL_cmpI\n\t"
%}
ins_encode %{
@@ -10052,7 +10144,7 @@ instruct cmovL_cmpU(iRegLNoSp dst, iRegL src, iRegI op1, iRegI op2, cmpOpU cop)
ins_cost(ALU_COST + BRANCH_COST);
format %{
- "CMove $dst, ($op1 $cop $op2), $dst, $src\t#@cmovL_cmpU\n\t"
+ "CMoveL $dst, ($op1 $cop $op2), $dst, $src\t#@cmovL_cmpU\n\t"
%}
ins_encode %{
@@ -10064,12 +10156,46 @@ instruct cmovL_cmpU(iRegLNoSp dst, iRegL src, iRegI op1, iRegI op2, cmpOpU cop)
ins_pipe(pipe_class_compare);
%}
+instruct cmovL_cmpF(iRegLNoSp dst, iRegL src, fRegF op1, fRegF op2, cmpOp cop) %{
+ match(Set dst (CMoveL (Binary cop (CmpF op1 op2)) (Binary dst src)));
+ ins_cost(ALU_COST + BRANCH_COST);
+
+ format %{
+ "CMoveL $dst, ($op1 $cop $op2), $dst, $src\t#@cmovL_cmpF\n\t"
+ %}
+
+ ins_encode %{
+ __ enc_cmove_cmp_fp($cop$$cmpcode,
+ as_FloatRegister($op1$$reg), as_FloatRegister($op2$$reg),
+ as_Register($dst$$reg), as_Register($src$$reg), true /* is_single */);
+ %}
+
+ ins_pipe(pipe_class_compare);
+%}
+
+instruct cmovL_cmpD(iRegLNoSp dst, iRegL src, fRegD op1, fRegD op2, cmpOp cop) %{
+ match(Set dst (CMoveL (Binary cop (CmpD op1 op2)) (Binary dst src)));
+ ins_cost(ALU_COST + BRANCH_COST);
+
+ format %{
+ "CMoveL $dst, ($op1 $cop $op2), $dst, $src\t#@cmovL_cmpD\n\t"
+ %}
+
+ ins_encode %{
+ __ enc_cmove_cmp_fp($cop$$cmpcode | C2_MacroAssembler::double_branch_mask,
+ as_FloatRegister($op1$$reg), as_FloatRegister($op2$$reg),
+ as_Register($dst$$reg), as_Register($src$$reg), false /* is_single */);
+ %}
+
+ ins_pipe(pipe_class_compare);
+%}
+
instruct cmovL_cmpN(iRegLNoSp dst, iRegL src, iRegN op1, iRegN op2, cmpOpU cop) %{
match(Set dst (CMoveL (Binary cop (CmpN op1 op2)) (Binary dst src)));
ins_cost(ALU_COST + BRANCH_COST);
format %{
- "CMove $dst, ($op1 $cop $op2), $dst, $src\t#@cmovL_cmpN\n\t"
+ "CMoveL $dst, ($op1 $cop $op2), $dst, $src\t#@cmovL_cmpN\n\t"
%}
ins_encode %{
@@ -10086,7 +10212,7 @@ instruct cmovL_cmpP(iRegLNoSp dst, iRegL src, iRegP op1, iRegP op2, cmpOpU cop)
ins_cost(ALU_COST + BRANCH_COST);
format %{
- "CMove $dst, ($op1 $cop $op2), $dst, $src\t#@cmovL_cmpP\n\t"
+ "CMoveL $dst, ($op1 $cop $op2), $dst, $src\t#@cmovL_cmpP\n\t"
%}
ins_encode %{
@@ -10847,7 +10973,8 @@ instruct ShouldNotReachHere() %{
ins_encode %{
if (is_reachable()) {
- __ stop(_halt_reason);
+ const char* str = __ code_string(_halt_reason);
+ __ stop(str);
}
%}
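Taken together, the rewritten membar rules encode one policy: with UseZtso (RVTSO) the acquire/release/storestore barriers cost nothing and only a StoreLoad-ordering MemBarVolatile still emits a fence, and even that is dropped when Matcher::post_store_load_barrier shows a preceding atomic already provides the ordering; under RVWMO each barrier maps to its usual fence. A standalone sketch of that selection logic (not part of the patch; the Policy struct and barrier names are illustrative stand-ins for the matcher predicates):

#include <cstdio>

enum Barrier { Acquire, Release, StoreStoreBar, Volatile /* StoreLoad */ };

struct Policy {
  bool use_ztso;                // RVTSO memory model selected (UseZtso analogue)
  bool follows_store_load_pair; // Matcher::post_store_load_barrier(n) analogue
};

// Returns the fence to emit, or nullptr when the barrier can be elided.
static const char* fence_for(Barrier b, const Policy& p) {
  if (b == Volatile) {
    if (p.follows_store_load_pair) return nullptr; // ordered by the adjacent atomic
    return "fence w, r";                           // StoreLoad is needed even on TSO
  }
  if (p.use_ztso) return nullptr;                  // TSO already orders these
  switch (b) {
    case Acquire:       return "fence r, rw";
    case Release:       return "fence rw, w";
    case StoreStoreBar: return "fence w, w";
    default:            return nullptr;
  }
}

static void show(const char* what, const char* f) {
  std::printf("%-28s: %s\n", what, f ? f : "(elided)");
}

int main() {
  Policy tso{true, false}, wmo{false, false};
  show("TSO acquire", fence_for(Acquire, tso));                      // (elided)
  show("TSO volatile", fence_for(Volatile, tso));                    // fence w, r
  show("WMO release", fence_for(Release, wmo));                      // fence rw, w
  show("WMO volatile after atomic", fence_for(Volatile, {false, true})); // (elided)
  return 0;
}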
diff --git a/src/hotspot/cpu/riscv/riscv_b.ad b/src/hotspot/cpu/riscv/riscv_b.ad
index ed9fca13a1b06..beac10ec03d04 100644
--- a/src/hotspot/cpu/riscv/riscv_b.ad
+++ b/src/hotspot/cpu/riscv/riscv_b.ad
@@ -25,7 +25,8 @@
// RISCV Bit-Manipulation Extension Architecture Description File
-instruct rorI_imm_b(iRegINoSp dst, iRegI src, immI shift) %{
+// Rotate Right Word Immediate
+instruct rorI_imm_b(iRegINoSp dst, iRegIorL2I src, immI shift) %{
predicate(UseZbb);
match(Set dst (RotateRight src shift));
@@ -39,6 +40,7 @@ instruct rorI_imm_b(iRegINoSp dst, iRegI src, immI shift) %{
ins_pipe(ialu_reg_shift);
%}
+// Rotate Right Immediate
instruct rorL_imm_b(iRegLNoSp dst, iRegL src, immI shift) %{
predicate(UseZbb);
match(Set dst (RotateRight src shift));
@@ -53,7 +55,9 @@ instruct rorL_imm_b(iRegLNoSp dst, iRegL src, immI shift) %{
ins_pipe(ialu_reg_shift);
%}
-instruct rorI_reg_b(iRegINoSp dst, iRegI src, iRegI shift) %{
+// Rotate Right Word Register
+// Only the low 5 bits of shift value are used, all other bits are ignored.
+instruct rorI_reg_b(iRegINoSp dst, iRegIorL2I src, iRegIorL2I shift) %{
predicate(UseZbb);
match(Set dst (RotateRight src shift));
@@ -65,7 +69,9 @@ instruct rorI_reg_b(iRegINoSp dst, iRegI src, iRegI shift) %{
ins_pipe(ialu_reg_reg);
%}
-instruct rorL_reg_b(iRegLNoSp dst, iRegL src, iRegI shift) %{
+// Rotate Right Register
+// Only the low 6 bits of shift value are used, all other bits are ignored.
+instruct rorL_reg_b(iRegLNoSp dst, iRegL src, iRegIorL2I shift) %{
predicate(UseZbb);
match(Set dst (RotateRight src shift));
@@ -77,7 +83,9 @@ instruct rorL_reg_b(iRegLNoSp dst, iRegL src, iRegI shift) %{
ins_pipe(ialu_reg_reg);
%}
-instruct rolI_reg_b(iRegINoSp dst, iRegI src, iRegI shift) %{
+// Rotate Left Word Register
+// Only the low 5 bits of shift value are used, all other bits are ignored.
+instruct rolI_reg_b(iRegINoSp dst, iRegIorL2I src, iRegIorL2I shift) %{
predicate(UseZbb);
match(Set dst (RotateLeft src shift));
@@ -89,7 +97,9 @@ instruct rolI_reg_b(iRegINoSp dst, iRegI src, iRegI shift) %{
ins_pipe(ialu_reg_reg);
%}
-instruct rolL_reg_b(iRegLNoSp dst, iRegL src, iRegI shift) %{
+// Rotate Left Register
+// Only the low 6 bits of shift value are used, all other bits are ignored.
+instruct rolL_reg_b(iRegLNoSp dst, iRegL src, iRegIorL2I shift) %{
predicate(UseZbb);
match(Set dst (RotateLeft src shift));
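The new comments note that ror/rorw only consume the low 6/5 bits of the shift register, so these rules can pass an unmasked Java shift amount straight through. A tiny standalone check of the 32-bit case (not part of the patch):

#include <cstdint>
#include <cstdio>

// Rotating a 32-bit value only ever depends on the low 5 bits of the shift
// amount, which is exactly what the hardware does with the shift operand.
static uint32_t rotr32(uint32_t x, uint32_t n) {
  n &= 31;
  return n == 0 ? x : (x >> n) | (x << (32 - n));
}

int main() {
  uint32_t x = 0x12345678u;
  // A shift of 33 behaves exactly like a shift of 1 (33 & 31 == 1).
  std::printf("%08x %08x\n", rotr32(x, 1), rotr32(x, 33));
  return 0;
}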
diff --git a/src/hotspot/cpu/riscv/riscv_v.ad b/src/hotspot/cpu/riscv/riscv_v.ad
index cd9d2107b7988..b6b8395dbb755 100644
--- a/src/hotspot/cpu/riscv/riscv_v.ad
+++ b/src/hotspot/cpu/riscv/riscv_v.ad
@@ -80,11 +80,17 @@ source %{
case Op_PopCountVI:
case Op_ReverseBytesV:
case Op_ReverseV:
+ return UseZvbb;
case Op_RotateLeftV:
case Op_RotateRightV:
+ if (bt != T_INT && bt != T_LONG) {
+ return false;
+ }
return UseZvbb;
case Op_LoadVectorGather:
case Op_LoadVectorGatherMasked:
+ case Op_StoreVectorScatter:
+ case Op_StoreVectorScatterMasked:
if (is_subword_type(bt)) {
return false;
}
@@ -245,7 +251,6 @@ instruct vmaskcmp_fp(vRegMask dst, vReg src1, vReg src2, immI cond) %{
predicate(Matcher::vector_element_basic_type(n) == T_FLOAT ||
Matcher::vector_element_basic_type(n) == T_DOUBLE);
match(Set dst (VectorMaskCmp (Binary src1 src2) cond));
- effect(TEMP_DEF dst);
format %{ "vmaskcmp_fp $dst, $src1, $src2, $cond" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
@@ -409,11 +414,11 @@ instruct vadd_fp_masked(vReg dst_src1, vReg src2, vRegMask_V0 v0) %{
// vector-immediate add (unpredicated)
-instruct vadd_immI(vReg dst, vReg src1, immI5 con) %{
+instruct vadd_vi(vReg dst, vReg src1, immI5 con) %{
match(Set dst (AddVB src1 (Replicate con)));
match(Set dst (AddVS src1 (Replicate con)));
match(Set dst (AddVI src1 (Replicate con)));
- format %{ "vadd_immI $dst, $src1, $con" %}
+ format %{ "vadd_vi $dst, $src1, $con" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -424,9 +429,9 @@ instruct vadd_immI(vReg dst, vReg src1, immI5 con) %{
ins_pipe(pipe_slow);
%}
-instruct vadd_immL(vReg dst, vReg src1, immL5 con) %{
+instruct vaddL_vi(vReg dst, vReg src1, immL5 con) %{
match(Set dst (AddVL src1 (Replicate con)));
- format %{ "vadd_immL $dst, $src1, $con" %}
+ format %{ "vaddL_vi $dst, $src1, $con" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
__ vadd_vi(as_VectorRegister($dst$$reg),
@@ -438,11 +443,11 @@ instruct vadd_immL(vReg dst, vReg src1, immL5 con) %{
// vector-scalar add (unpredicated)
-instruct vadd_regI(vReg dst, vReg src1, iRegIorL2I src2) %{
+instruct vadd_vx(vReg dst, vReg src1, iRegIorL2I src2) %{
match(Set dst (AddVB src1 (Replicate src2)));
match(Set dst (AddVS src1 (Replicate src2)));
match(Set dst (AddVI src1 (Replicate src2)));
- format %{ "vadd_regI $dst, $src1, $src2" %}
+ format %{ "vadd_vx $dst, $src1, $src2" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -453,9 +458,9 @@ instruct vadd_regI(vReg dst, vReg src1, iRegIorL2I src2) %{
ins_pipe(pipe_slow);
%}
-instruct vadd_regL(vReg dst, vReg src1, iRegL src2) %{
+instruct vaddL_vx(vReg dst, vReg src1, iRegL src2) %{
match(Set dst (AddVL src1 (Replicate src2)));
- format %{ "vadd_regL $dst, $src1, $src2" %}
+ format %{ "vaddL_vx $dst, $src1, $src2" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
__ vadd_vx(as_VectorRegister($dst$$reg),
@@ -467,11 +472,11 @@ instruct vadd_regL(vReg dst, vReg src1, iRegL src2) %{
// vector-immediate add (predicated)
-instruct vadd_immI_masked(vReg dst_src, immI5 con, vRegMask_V0 v0) %{
+instruct vadd_vi_masked(vReg dst_src, immI5 con, vRegMask_V0 v0) %{
match(Set dst_src (AddVB (Binary dst_src (Replicate con)) v0));
match(Set dst_src (AddVS (Binary dst_src (Replicate con)) v0));
match(Set dst_src (AddVI (Binary dst_src (Replicate con)) v0));
- format %{ "vadd_immI_masked $dst_src, $dst_src, $con" %}
+ format %{ "vadd_vi_masked $dst_src, $dst_src, $con, $v0" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -482,9 +487,9 @@ instruct vadd_immI_masked(vReg dst_src, immI5 con, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vadd_immL_masked(vReg dst_src, immL5 con, vRegMask_V0 v0) %{
+instruct vaddL_vi_masked(vReg dst_src, immL5 con, vRegMask_V0 v0) %{
match(Set dst_src (AddVL (Binary dst_src (Replicate con)) v0));
- format %{ "vadd_immL_masked $dst_src, $dst_src, $con" %}
+ format %{ "vaddL_vi_masked $dst_src, $dst_src, $con, $v0" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
__ vadd_vi(as_VectorRegister($dst_src$$reg),
@@ -496,11 +501,11 @@ instruct vadd_immL_masked(vReg dst_src, immL5 con, vRegMask_V0 v0) %{
// vector-scalar add (predicated)
-instruct vadd_regI_masked(vReg dst_src, iRegIorL2I src2, vRegMask_V0 v0) %{
+instruct vadd_vx_masked(vReg dst_src, iRegIorL2I src2, vRegMask_V0 v0) %{
match(Set dst_src (AddVB (Binary dst_src (Replicate src2)) v0));
match(Set dst_src (AddVS (Binary dst_src (Replicate src2)) v0));
match(Set dst_src (AddVI (Binary dst_src (Replicate src2)) v0));
- format %{ "vadd_regI_masked $dst_src, $dst_src, $src2" %}
+ format %{ "vadd_vx_masked $dst_src, $dst_src, $src2, $v0" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -511,9 +516,9 @@ instruct vadd_regI_masked(vReg dst_src, iRegIorL2I src2, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vadd_regL_masked(vReg dst_src, iRegL src2, vRegMask_V0 v0) %{
+instruct vaddL_vx_masked(vReg dst_src, iRegL src2, vRegMask_V0 v0) %{
match(Set dst_src (AddVL (Binary dst_src (Replicate src2)) v0));
- format %{ "vadd_regL_masked $dst_src, $dst_src, $src2" %}
+ format %{ "vaddL_vx_masked $dst_src, $dst_src, $src2, $v0" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
__ vadd_vx(as_VectorRegister($dst_src$$reg),
@@ -589,11 +594,11 @@ instruct vsub_fp_masked(vReg dst_src1, vReg src2, vRegMask_V0 v0) %{
// vector-scalar sub (unpredicated)
-instruct vsub_regI(vReg dst, vReg src1, iRegIorL2I src2) %{
+instruct vsub_vx(vReg dst, vReg src1, iRegIorL2I src2) %{
match(Set dst (SubVB src1 (Replicate src2)));
match(Set dst (SubVS src1 (Replicate src2)));
match(Set dst (SubVI src1 (Replicate src2)));
- format %{ "vsub_regI $dst, $src1, $src2" %}
+ format %{ "vsub_vx $dst, $src1, $src2" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -604,9 +609,9 @@ instruct vsub_regI(vReg dst, vReg src1, iRegIorL2I src2) %{
ins_pipe(pipe_slow);
%}
-instruct vsub_regL(vReg dst, vReg src1, iRegL src2) %{
+instruct vsubL_vx(vReg dst, vReg src1, iRegL src2) %{
match(Set dst (SubVL src1 (Replicate src2)));
- format %{ "vsub_regL $dst, $src1, $src2" %}
+ format %{ "vsubL_vx $dst, $src1, $src2" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
__ vsub_vx(as_VectorRegister($dst$$reg),
@@ -618,11 +623,11 @@ instruct vsub_regL(vReg dst, vReg src1, iRegL src2) %{
// vector-scalar sub (predicated)
-instruct vsub_regI_masked(vReg dst_src, iRegIorL2I src2, vRegMask_V0 v0) %{
+instruct vsub_vx_masked(vReg dst_src, iRegIorL2I src2, vRegMask_V0 v0) %{
match(Set dst_src (SubVB (Binary dst_src (Replicate src2)) v0));
match(Set dst_src (SubVS (Binary dst_src (Replicate src2)) v0));
match(Set dst_src (SubVI (Binary dst_src (Replicate src2)) v0));
- format %{ "vsub_regI_masked $dst_src, $dst_src, $src2" %}
+ format %{ "vsub_vx_masked $dst_src, $dst_src, $src2, $v0" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -633,9 +638,9 @@ instruct vsub_regI_masked(vReg dst_src, iRegIorL2I src2, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vsub_regL_masked(vReg dst_src, iRegL src2, vRegMask_V0 v0) %{
+instruct vsubL_vx_masked(vReg dst_src, iRegL src2, vRegMask_V0 v0) %{
match(Set dst_src (SubVL (Binary dst_src (Replicate src2)) v0));
- format %{ "vsub_regL_masked $dst_src, $dst_src, $src2" %}
+ format %{ "vsubL_vx_masked $dst_src, $dst_src, $src2, $v0" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
__ vsub_vx(as_VectorRegister($dst_src$$reg),
@@ -645,6 +650,144 @@ instruct vsub_regL_masked(vReg dst_src, iRegL src2, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
+// -------- vector saturating integer operations
+
+// vector saturating signed integer addition
+
+instruct vsadd(vReg dst, vReg src1, vReg src2) %{
+ predicate(n->is_SaturatingVector() && !n->as_SaturatingVector()->is_unsigned());
+ match(Set dst (SaturatingAddV src1 src2));
+ ins_cost(VEC_COST);
+ format %{ "vsadd $dst, $src1, $src2" %}
+ ins_encode %{
+ BasicType bt = Matcher::vector_element_basic_type(this);
+ assert(is_integral_type(bt), "unsupported type");
+ __ vsetvli_helper(bt, Matcher::vector_length(this));
+ __ vsadd_vv(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg), as_VectorRegister($src2$$reg));
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+// vector saturating unsigned integer addition
+
+instruct vsaddu(vReg dst, vReg src1, vReg src2) %{
+ predicate(n->is_SaturatingVector() && n->as_SaturatingVector()->is_unsigned());
+ match(Set dst (SaturatingAddV src1 src2));
+ ins_cost(VEC_COST);
+ format %{ "vsaddu $dst, $src1, $src2" %}
+ ins_encode %{
+ BasicType bt = Matcher::vector_element_basic_type(this);
+ assert(is_integral_type(bt), "unsupported type");
+ __ vsetvli_helper(bt, Matcher::vector_length(this));
+ __ vsaddu_vv(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg), as_VectorRegister($src2$$reg));
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+// vector saturating signed integer addition (predicated)
+
+instruct vsadd_masked(vReg dst_src, vReg src1, vRegMask_V0 v0) %{
+ predicate(n->is_SaturatingVector() && !n->as_SaturatingVector()->is_unsigned());
+ match(Set dst_src (SaturatingAddV (Binary dst_src src1) v0));
+ ins_cost(VEC_COST);
+ format %{ "vsadd_masked $dst_src, $dst_src, $src1, $v0" %}
+ ins_encode %{
+ BasicType bt = Matcher::vector_element_basic_type(this);
+ assert(is_integral_type(bt), "unsupported type");
+ __ vsetvli_helper(bt, Matcher::vector_length(this));
+ __ vsadd_vv(as_VectorRegister($dst_src$$reg), as_VectorRegister($dst_src$$reg),
+ as_VectorRegister($src1$$reg), Assembler::v0_t);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+// vector saturating unsigned integer addition (predicated)
+
+instruct vsaddu_masked(vReg dst_src, vReg src1, vRegMask_V0 v0) %{
+ predicate(n->is_SaturatingVector() && n->as_SaturatingVector()->is_unsigned());
+ match(Set dst_src (SaturatingAddV (Binary dst_src src1) v0));
+ ins_cost(VEC_COST);
+ format %{ "vsaddu_masked $dst_src, $dst_src, $src1, $v0" %}
+ ins_encode %{
+ BasicType bt = Matcher::vector_element_basic_type(this);
+ assert(is_integral_type(bt), "unsupported type");
+ __ vsetvli_helper(bt, Matcher::vector_length(this));
+ __ vsaddu_vv(as_VectorRegister($dst_src$$reg), as_VectorRegister($dst_src$$reg),
+ as_VectorRegister($src1$$reg), Assembler::v0_t);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+// vector saturating signed integer subtraction
+
+instruct vssub(vReg dst, vReg src1, vReg src2) %{
+ predicate(n->is_SaturatingVector() && !n->as_SaturatingVector()->is_unsigned());
+ match(Set dst (SaturatingSubV src1 src2));
+ ins_cost(VEC_COST);
+ format %{ "vssub $dst, $src1, $src2" %}
+ ins_encode %{
+ BasicType bt = Matcher::vector_element_basic_type(this);
+ assert(is_integral_type(bt), "unsupported type");
+ __ vsetvli_helper(bt, Matcher::vector_length(this));
+ __ vssub_vv(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg), as_VectorRegister($src2$$reg));
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+// vector saturating unsigned integer subtraction
+
+instruct vssubu(vReg dst, vReg src1, vReg src2) %{
+ predicate(n->is_SaturatingVector() && n->as_SaturatingVector()->is_unsigned());
+ match(Set dst (SaturatingSubV src1 src2));
+ ins_cost(VEC_COST);
+ format %{ "vssubu $dst, $src1, $src2" %}
+ ins_encode %{
+ BasicType bt = Matcher::vector_element_basic_type(this);
+ assert(is_integral_type(bt), "unsupported type");
+ __ vsetvli_helper(bt, Matcher::vector_length(this));
+ __ vssubu_vv(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg), as_VectorRegister($src2$$reg));
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+// vector saturating signed integer subtraction (predicated)
+
+instruct vssub_masked(vReg dst_src, vReg src1, vRegMask_V0 v0) %{
+ predicate(n->is_SaturatingVector() && !n->as_SaturatingVector()->is_unsigned());
+ match(Set dst_src (SaturatingSubV (Binary dst_src src1) v0));
+ ins_cost(VEC_COST);
+ format %{ "vssub_masked $dst_src, $dst_src, $src1, $v0" %}
+ ins_encode %{
+ BasicType bt = Matcher::vector_element_basic_type(this);
+ assert(is_integral_type(bt), "unsupported type");
+ __ vsetvli_helper(bt, Matcher::vector_length(this));
+ __ vssub_vv(as_VectorRegister($dst_src$$reg), as_VectorRegister($dst_src$$reg),
+ as_VectorRegister($src1$$reg), Assembler::v0_t);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+// vector saturating unsigned integer subtraction (predicated)
+
+instruct vssubu_masked(vReg dst_src, vReg src1, vRegMask_V0 v0) %{
+ predicate(n->is_SaturatingVector() && n->as_SaturatingVector()->is_unsigned());
+ match(Set dst_src (SaturatingSubV (Binary dst_src src1) v0));
+ ins_cost(VEC_COST);
+ format %{ "vssubu_masked $dst_src, $dst_src, $src1, $v0" %}
+ ins_encode %{
+ BasicType bt = Matcher::vector_element_basic_type(this);
+ assert(is_integral_type(bt), "unsupported type");
+ __ vsetvli_helper(bt, Matcher::vector_length(this));
+ __ vssubu_vv(as_VectorRegister($dst_src$$reg), as_VectorRegister($dst_src$$reg),
+ as_VectorRegister($src1$$reg), Assembler::v0_t);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
// vector and
instruct vand(vReg dst, vReg src1, vReg src2) %{
@@ -679,30 +822,30 @@ instruct vand_masked(vReg dst_src1, vReg src2, vRegMask_V0 v0) %{
// vector-immediate and (unpredicated)
-instruct vand_immI(vReg dst_src, immI5 con) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
- match(Set dst_src (AndV dst_src (Replicate con)));
- format %{ "vand_immI $dst_src, $dst_src, $con" %}
+instruct vand_vi(vReg dst, vReg src1, immI5 con) %{
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
+ match(Set dst (AndV src1 (Replicate con)));
+ format %{ "vand_vi $dst, $src1, $con" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
- __ vand_vi(as_VectorRegister($dst_src$$reg),
- as_VectorRegister($dst_src$$reg),
+ __ vand_vi(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
$con$$constant);
%}
ins_pipe(pipe_slow);
%}
-instruct vand_immL(vReg dst_src, immL5 con) %{
+instruct vandL_vi(vReg dst, vReg src1, immL5 con) %{
predicate(Matcher::vector_element_basic_type(n) == T_LONG);
- match(Set dst_src (AndV dst_src (Replicate con)));
- format %{ "vand_immL $dst_src, $dst_src, $con" %}
+ match(Set dst (AndV src1 (Replicate con)));
+ format %{ "vandL_vi $dst, $src1, $con" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
- __ vand_vi(as_VectorRegister($dst_src$$reg),
- as_VectorRegister($dst_src$$reg),
+ __ vand_vi(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
$con$$constant);
%}
ins_pipe(pipe_slow);
@@ -710,43 +853,43 @@ instruct vand_immL(vReg dst_src, immL5 con) %{
// vector-scalar and (unpredicated)
-instruct vand_regI(vReg dst_src, iRegIorL2I src) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
- match(Set dst_src (AndV dst_src (Replicate src)));
- format %{ "vand_regI $dst_src, $dst_src, $src" %}
+instruct vand_vx(vReg dst, vReg src1, iRegIorL2I src2) %{
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
+ match(Set dst (AndV src1 (Replicate src2)));
+ format %{ "vand_vx $dst, $src1, $src2" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
- __ vand_vx(as_VectorRegister($dst_src$$reg),
- as_VectorRegister($dst_src$$reg),
- as_Register($src$$reg));
+ __ vand_vx(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
+ as_Register($src2$$reg));
%}
ins_pipe(pipe_slow);
%}
-instruct vand_regL(vReg dst_src, iRegL src) %{
+instruct vandL_vx(vReg dst, vReg src1, iRegL src2) %{
predicate(Matcher::vector_element_basic_type(n) == T_LONG);
- match(Set dst_src (AndV dst_src (Replicate src)));
- format %{ "vand_regL $dst_src, $dst_src, $src" %}
+ match(Set dst (AndV src1 (Replicate src2)));
+ format %{ "vandL_vx $dst, $src1, $src2" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
- __ vand_vx(as_VectorRegister($dst_src$$reg),
- as_VectorRegister($dst_src$$reg),
- as_Register($src$$reg));
+ __ vand_vx(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
+ as_Register($src2$$reg));
%}
ins_pipe(pipe_slow);
%}
// vector-immediate and (predicated)
-instruct vand_immI_masked(vReg dst_src, immI5 con, vRegMask_V0 v0) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
+instruct vand_vi_masked(vReg dst_src, immI5 con, vRegMask_V0 v0) %{
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
match(Set dst_src (AndV (Binary dst_src (Replicate con)) v0));
- format %{ "vand_immI_masked $dst_src, $dst_src, $con" %}
+ format %{ "vand_vi_masked $dst_src, $dst_src, $con, $v0" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -757,10 +900,10 @@ instruct vand_immI_masked(vReg dst_src, immI5 con, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vand_immL_masked(vReg dst_src, immL5 con, vRegMask_V0 v0) %{
+instruct vandL_vi_masked(vReg dst_src, immL5 con, vRegMask_V0 v0) %{
predicate(Matcher::vector_element_basic_type(n) == T_LONG);
match(Set dst_src (AndV (Binary dst_src (Replicate con)) v0));
- format %{ "vand_immL_masked $dst_src, $dst_src, $con" %}
+ format %{ "vandL_vi_masked $dst_src, $dst_src, $con, $v0" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
__ vand_vi(as_VectorRegister($dst_src$$reg),
@@ -772,12 +915,12 @@ instruct vand_immL_masked(vReg dst_src, immL5 con, vRegMask_V0 v0) %{
// vector-scalar and (predicated)
-instruct vand_regI_masked(vReg dst_src, iRegIorL2I src, vRegMask_V0 v0) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
+instruct vand_vx_masked(vReg dst_src, iRegIorL2I src, vRegMask_V0 v0) %{
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
match(Set dst_src (AndV (Binary dst_src (Replicate src)) v0));
- format %{ "vand_regI_masked $dst_src, $dst_src, $src" %}
+ format %{ "vand_vx_masked $dst_src, $dst_src, $src, $v0" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -788,10 +931,10 @@ instruct vand_regI_masked(vReg dst_src, iRegIorL2I src, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vand_regL_masked(vReg dst_src, iRegL src, vRegMask_V0 v0) %{
+instruct vandL_vx_masked(vReg dst_src, iRegL src, vRegMask_V0 v0) %{
predicate(Matcher::vector_element_basic_type(n) == T_LONG);
match(Set dst_src (AndV (Binary dst_src (Replicate src)) v0));
- format %{ "vand_regL_masked $dst_src, $dst_src, $src" %}
+ format %{ "vandL_vx_masked $dst_src, $dst_src, $src, $v0" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
__ vand_vx(as_VectorRegister($dst_src$$reg),
@@ -835,30 +978,30 @@ instruct vor_masked(vReg dst_src1, vReg src2, vRegMask_V0 v0) %{
// vector-immediate or (unpredicated)
-instruct vor_immI(vReg dst_src, immI5 con) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
- match(Set dst_src (OrV dst_src (Replicate con)));
- format %{ "vor_immI $dst_src, $dst_src, $con" %}
+instruct vor_vi(vReg dst, vReg src1, immI5 con) %{
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
+ match(Set dst (OrV src1 (Replicate con)));
+ format %{ "vor_vi $dst, $src1, $con" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
- __ vor_vi(as_VectorRegister($dst_src$$reg),
- as_VectorRegister($dst_src$$reg),
+ __ vor_vi(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
$con$$constant);
%}
ins_pipe(pipe_slow);
%}
-instruct vor_immL(vReg dst_src, immL5 con) %{
+instruct vorL_vi(vReg dst, vReg src1, immL5 con) %{
predicate(Matcher::vector_element_basic_type(n) == T_LONG);
- match(Set dst_src (OrV dst_src (Replicate con)));
- format %{ "vor_immL $dst_src, $dst_src, $con" %}
+ match(Set dst (OrV src1 (Replicate con)));
+ format %{ "vorL_vi $dst, $src1, $con" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
- __ vor_vi(as_VectorRegister($dst_src$$reg),
- as_VectorRegister($dst_src$$reg),
+ __ vor_vi(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
$con$$constant);
%}
ins_pipe(pipe_slow);
@@ -866,43 +1009,43 @@ instruct vor_immL(vReg dst_src, immL5 con) %{
// vector-scalar or (unpredicated)
-instruct vor_regI(vReg dst_src, iRegIorL2I src) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
- match(Set dst_src (OrV dst_src (Replicate src)));
- format %{ "vor_regI $dst_src, $dst_src, $src" %}
+instruct vor_vx(vReg dst, vReg src1, iRegIorL2I src2) %{
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
+ match(Set dst (OrV src1 (Replicate src2)));
+ format %{ "vor_vx $dst, $src1, $src2" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
- __ vor_vx(as_VectorRegister($dst_src$$reg),
- as_VectorRegister($dst_src$$reg),
- as_Register($src$$reg));
+ __ vor_vx(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
+ as_Register($src2$$reg));
%}
ins_pipe(pipe_slow);
%}
-instruct vor_regL(vReg dst_src, iRegL src) %{
+instruct vorL_vx(vReg dst, vReg src1, iRegL src2) %{
predicate(Matcher::vector_element_basic_type(n) == T_LONG);
- match(Set dst_src (OrV dst_src (Replicate src)));
- format %{ "vor_regL $dst_src, $dst_src, $src" %}
+ match(Set dst (OrV src1 (Replicate src2)));
+ format %{ "vorL_vx $dst, $src1, $src2" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
- __ vor_vx(as_VectorRegister($dst_src$$reg),
- as_VectorRegister($dst_src$$reg),
- as_Register($src$$reg));
+ __ vor_vx(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
+ as_Register($src2$$reg));
%}
ins_pipe(pipe_slow);
%}
// vector-immediate or (predicated)
-instruct vor_immI_masked(vReg dst_src, immI5 con, vRegMask_V0 v0) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
+instruct vor_vi_masked(vReg dst_src, immI5 con, vRegMask_V0 v0) %{
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
match(Set dst_src (OrV (Binary dst_src (Replicate con)) v0));
- format %{ "vor_immI_masked $dst_src, $dst_src, $con" %}
+ format %{ "vor_vi_masked $dst_src, $dst_src, $con, $v0" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -913,10 +1056,10 @@ instruct vor_immI_masked(vReg dst_src, immI5 con, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vor_immL_masked(vReg dst_src, immL5 con, vRegMask_V0 v0) %{
+instruct vorL_vi_masked(vReg dst_src, immL5 con, vRegMask_V0 v0) %{
predicate(Matcher::vector_element_basic_type(n) == T_LONG);
match(Set dst_src (OrV (Binary dst_src (Replicate con)) v0));
- format %{ "vor_immL_masked $dst_src, $dst_src, $con" %}
+ format %{ "vorL_vi_masked $dst_src, $dst_src, $con, $v0" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
__ vor_vi(as_VectorRegister($dst_src$$reg),
@@ -928,12 +1071,12 @@ instruct vor_immL_masked(vReg dst_src, immL5 con, vRegMask_V0 v0) %{
// vector-scalar or (predicated)
-instruct vor_regI_masked(vReg dst_src, iRegIorL2I src, vRegMask_V0 v0) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
+instruct vor_vx_masked(vReg dst_src, iRegIorL2I src, vRegMask_V0 v0) %{
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
match(Set dst_src (OrV (Binary dst_src (Replicate src)) v0));
- format %{ "vor_regI_masked $dst_src, $dst_src, $src" %}
+ format %{ "vor_vx_masked $dst_src, $dst_src, $src, $v0" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -944,10 +1087,10 @@ instruct vor_regI_masked(vReg dst_src, iRegIorL2I src, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vor_regL_masked(vReg dst_src, iRegL src, vRegMask_V0 v0) %{
+instruct vorL_vx_masked(vReg dst_src, iRegL src, vRegMask_V0 v0) %{
predicate(Matcher::vector_element_basic_type(n) == T_LONG);
match(Set dst_src (OrV (Binary dst_src (Replicate src)) v0));
- format %{ "vor_regL_masked $dst_src, $dst_src, $src" %}
+ format %{ "vorL_vx_masked $dst_src, $dst_src, $src, $v0" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
__ vor_vx(as_VectorRegister($dst_src$$reg),
@@ -991,30 +1134,30 @@ instruct vxor_masked(vReg dst_src1, vReg src2, vRegMask_V0 v0) %{
// vector-immediate xor (unpredicated)
-instruct vxor_immI(vReg dst_src, immI5 con) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
- match(Set dst_src (XorV dst_src (Replicate con)));
- format %{ "vxor_immI $dst_src, $dst_src, $con" %}
+instruct vxor_vi(vReg dst, vReg src1, immI5 con) %{
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
+ match(Set dst (XorV src1 (Replicate con)));
+ format %{ "vxor_vi $dst, $src1, $con" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
- __ vxor_vi(as_VectorRegister($dst_src$$reg),
- as_VectorRegister($dst_src$$reg),
+ __ vxor_vi(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
$con$$constant);
%}
ins_pipe(pipe_slow);
%}
-instruct vxor_immL(vReg dst_src, immL5 con) %{
+instruct vxorL_vi(vReg dst, vReg src1, immL5 con) %{
predicate(Matcher::vector_element_basic_type(n) == T_LONG);
- match(Set dst_src (XorV dst_src (Replicate con)));
- format %{ "vxor_immL $dst_src, $dst_src, $con" %}
+ match(Set dst (XorV src1 (Replicate con)));
+ format %{ "vxorL_vi $dst, $src1, $con" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
- __ vxor_vi(as_VectorRegister($dst_src$$reg),
- as_VectorRegister($dst_src$$reg),
+ __ vxor_vi(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
$con$$constant);
%}
ins_pipe(pipe_slow);
@@ -1022,43 +1165,43 @@ instruct vxor_immL(vReg dst_src, immL5 con) %{
// vector-scalar xor (unpredicated)
-instruct vxor_regI(vReg dst_src, iRegIorL2I src) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
- match(Set dst_src (XorV dst_src (Replicate src)));
- format %{ "vxor_regI $dst_src, $dst_src, $src" %}
+instruct vxor_vx(vReg dst, vReg src1, iRegIorL2I src2) %{
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
+ match(Set dst (XorV src1 (Replicate src2)));
+ format %{ "vxor_vx $dst, $src1, $src2" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
- __ vxor_vx(as_VectorRegister($dst_src$$reg),
- as_VectorRegister($dst_src$$reg),
- as_Register($src$$reg));
+ __ vxor_vx(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
+ as_Register($src2$$reg));
%}
ins_pipe(pipe_slow);
%}
-instruct vxor_regL(vReg dst_src, iRegL src) %{
+instruct vxorL_vx(vReg dst, vReg src1, iRegL src2) %{
predicate(Matcher::vector_element_basic_type(n) == T_LONG);
- match(Set dst_src (XorV dst_src (Replicate src)));
- format %{ "vxor_regL $dst_src, $dst_src, $src" %}
+ match(Set dst (XorV src1 (Replicate src2)));
+ format %{ "vxorL_vx $dst, $src1, $src2" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
- __ vxor_vx(as_VectorRegister($dst_src$$reg),
- as_VectorRegister($dst_src$$reg),
- as_Register($src$$reg));
+ __ vxor_vx(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
+ as_Register($src2$$reg));
%}
ins_pipe(pipe_slow);
%}
// vector-immediate xor (predicated)
-instruct vxor_immI_masked(vReg dst_src, immI5 con, vRegMask_V0 v0) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
+instruct vxor_vi_masked(vReg dst_src, immI5 con, vRegMask_V0 v0) %{
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
match(Set dst_src (XorV (Binary dst_src (Replicate con)) v0));
- format %{ "vxor_immI_masked $dst_src, $dst_src, $con" %}
+ format %{ "vxor_vi_masked $dst_src, $dst_src, $con, $v0" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -1069,10 +1212,10 @@ instruct vxor_immI_masked(vReg dst_src, immI5 con, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vxor_immL_masked(vReg dst_src, immL5 con, vRegMask_V0 v0) %{
+instruct vxorL_vi_masked(vReg dst_src, immL5 con, vRegMask_V0 v0) %{
predicate(Matcher::vector_element_basic_type(n) == T_LONG);
match(Set dst_src (XorV (Binary dst_src (Replicate con)) v0));
- format %{ "vxor_immL_masked $dst_src, $dst_src, $con" %}
+ format %{ "vxorL_vi_masked $dst_src, $dst_src, $con, $v0" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
__ vxor_vi(as_VectorRegister($dst_src$$reg),
@@ -1084,12 +1227,12 @@ instruct vxor_immL_masked(vReg dst_src, immL5 con, vRegMask_V0 v0) %{
// vector-scalar xor (predicated)
-instruct vxor_regI_masked(vReg dst_src, iRegIorL2I src, vRegMask_V0 v0) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
+instruct vxor_vx_masked(vReg dst_src, iRegIorL2I src, vRegMask_V0 v0) %{
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
match(Set dst_src (XorV (Binary dst_src (Replicate src)) v0));
- format %{ "vxor_regI_masked $dst_src, $dst_src, $src" %}
+ format %{ "vxor_vx_masked $dst_src, $dst_src, $src, $v0" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -1100,10 +1243,10 @@ instruct vxor_regI_masked(vReg dst_src, iRegIorL2I src, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vxor_regL_masked(vReg dst_src, iRegL src, vRegMask_V0 v0) %{
+instruct vxorL_vx_masked(vReg dst_src, iRegL src, vRegMask_V0 v0) %{
predicate(Matcher::vector_element_basic_type(n) == T_LONG);
match(Set dst_src (XorV (Binary dst_src (Replicate src)) v0));
- format %{ "vxor_regL_masked $dst_src, $dst_src, $src" %}
+ format %{ "vxorL_vx_masked $dst_src, $dst_src, $src, $v0" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
__ vxor_vx(as_VectorRegister($dst_src$$reg),
@@ -1113,16 +1256,236 @@ instruct vxor_regL_masked(vReg dst_src, iRegL src, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
+// ------------------------------ Vector and not -----------------------------------
+
+// vector and not
+
+instruct vand_notB(vReg dst, vReg src1, vReg src2, immI_M1 m1) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_BYTE);
+ match(Set dst (AndV src1 (XorV src2 (Replicate m1))));
+ format %{ "vand_notB $dst, $src1, $src2" %}
+ ins_encode %{
+ __ vsetvli_helper(T_BYTE, Matcher::vector_length(this));
+ __ vandn_vv(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
+ as_VectorRegister($src2$$reg));
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notS(vReg dst, vReg src1, vReg src2, immI_M1 m1) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_SHORT);
+ match(Set dst (AndV src1 (XorV src2 (Replicate m1))));
+ format %{ "vand_notS $dst, $src1, $src2" %}
+ ins_encode %{
+ __ vsetvli_helper(T_SHORT, Matcher::vector_length(this));
+ __ vandn_vv(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
+ as_VectorRegister($src2$$reg));
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notI(vReg dst, vReg src1, vReg src2, immI_M1 m1) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_INT);
+ match(Set dst (AndV src1 (XorV src2 (Replicate m1))));
+ format %{ "vand_notI $dst, $src1, $src2" %}
+ ins_encode %{
+ __ vsetvli_helper(T_INT, Matcher::vector_length(this));
+ __ vandn_vv(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
+ as_VectorRegister($src2$$reg));
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notL(vReg dst, vReg src1, vReg src2, immL_M1 m1) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_LONG);
+ match(Set dst (AndV src1 (XorV src2 (Replicate m1))));
+ format %{ "vand_notL $dst, $src1, $src2" %}
+ ins_encode %{
+ __ vsetvli_helper(T_LONG, Matcher::vector_length(this));
+ __ vandn_vv(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
+ as_VectorRegister($src2$$reg));
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notB_masked(vReg dst_src1, vReg src2, immI_M1 m1, vRegMask_V0 v0) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_BYTE);
+ match(Set dst_src1 (AndV (Binary dst_src1 (XorV src2 (Replicate m1))) v0));
+ format %{ "vand_notB_masked $dst_src1, $dst_src1, $src2, $v0" %}
+ ins_encode %{
+ __ vsetvli_helper(T_BYTE, Matcher::vector_length(this));
+ __ vandn_vv(as_VectorRegister($dst_src1$$reg),
+ as_VectorRegister($dst_src1$$reg),
+ as_VectorRegister($src2$$reg),
+ Assembler::v0_t);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notS_masked(vReg dst_src1, vReg src2, immI_M1 m1, vRegMask_V0 v0) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_SHORT);
+ match(Set dst_src1 (AndV (Binary dst_src1 (XorV src2 (Replicate m1))) v0));
+ format %{ "vand_notS_masked $dst_src1, $dst_src1, $src2, $v0" %}
+ ins_encode %{
+ __ vsetvli_helper(T_SHORT, Matcher::vector_length(this));
+ __ vandn_vv(as_VectorRegister($dst_src1$$reg),
+ as_VectorRegister($dst_src1$$reg),
+ as_VectorRegister($src2$$reg),
+ Assembler::v0_t);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notI_masked(vReg dst_src1, vReg src2, immI_M1 m1, vRegMask_V0 v0) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_INT);
+ match(Set dst_src1 (AndV (Binary dst_src1 (XorV src2 (Replicate m1))) v0));
+ format %{ "vand_notI_masked $dst_src1, $dst_src1, $src2, $v0" %}
+ ins_encode %{
+ __ vsetvli_helper(T_INT, Matcher::vector_length(this));
+ __ vandn_vv(as_VectorRegister($dst_src1$$reg),
+ as_VectorRegister($dst_src1$$reg),
+ as_VectorRegister($src2$$reg),
+ Assembler::v0_t);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notL_masked(vReg dst_src1, vReg src2, immL_M1 m1, vRegMask_V0 v0) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_LONG);
+ match(Set dst_src1 (AndV (Binary dst_src1 (XorV src2 (Replicate m1))) v0));
+ format %{ "vand_notL_masked $dst_src1, $dst_src1, $src2, $v0" %}
+ ins_encode %{
+ __ vsetvli_helper(T_LONG, Matcher::vector_length(this));
+ __ vandn_vv(as_VectorRegister($dst_src1$$reg),
+ as_VectorRegister($dst_src1$$reg),
+ as_VectorRegister($src2$$reg),
+ Assembler::v0_t);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notB_vx(vReg dst, vReg src1, iRegIorL2I src2, immI_M1 m1) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_BYTE);
+ match(Set dst (AndV src1 (Replicate (XorI src2 m1))));
+ format %{ "vand_notB_vx $dst, $src1, $src2" %}
+ ins_encode %{
+ __ vsetvli_helper(T_BYTE, Matcher::vector_length(this));
+ __ vandn_vx(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
+ as_Register($src2$$reg));
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notS_vx(vReg dst, vReg src1, iRegIorL2I src2, immI_M1 m1) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_SHORT);
+ match(Set dst (AndV src1 (Replicate (XorI src2 m1))));
+ format %{ "vand_notS_vx $dst, $src1, $src2" %}
+ ins_encode %{
+ __ vsetvli_helper(T_SHORT, Matcher::vector_length(this));
+ __ vandn_vx(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
+ as_Register($src2$$reg));
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notI_vx(vReg dst, vReg src1, iRegIorL2I src2, immI_M1 m1) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_INT);
+ match(Set dst (AndV src1 (Replicate (XorI src2 m1))));
+ format %{ "vand_notI_vx $dst, $src1, $src2" %}
+ ins_encode %{
+ __ vsetvli_helper(T_INT, Matcher::vector_length(this));
+ __ vandn_vx(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
+ as_Register($src2$$reg));
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notL_vx(vReg dst, vReg src1, iRegL src2, immL_M1 m1) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_LONG);
+ match(Set dst (AndV src1 (Replicate (XorL src2 m1))));
+ format %{ "vand_notL_vx $dst, $src1, $src2" %}
+ ins_encode %{
+ __ vsetvli_helper(T_LONG, Matcher::vector_length(this));
+ __ vandn_vx(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg),
+ as_Register($src2$$reg));
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notB_vx_masked(vReg dst_src1, iRegIorL2I src2, immI_M1 m1, vRegMask_V0 v0) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_BYTE);
+ match(Set dst_src1 (AndV (Binary dst_src1 (Replicate (XorI src2 m1))) v0));
+ format %{ "vand_notB_vx_masked $dst_src1, $dst_src1, $src2, $v0" %}
+ ins_encode %{
+ __ vsetvli_helper(T_BYTE, Matcher::vector_length(this));
+ __ vandn_vx(as_VectorRegister($dst_src1$$reg),
+ as_VectorRegister($dst_src1$$reg),
+ as_Register($src2$$reg),
+ Assembler::v0_t);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notS_vx_masked(vReg dst_src1, iRegIorL2I src2, immI_M1 m1, vRegMask_V0 v0) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_SHORT);
+ match(Set dst_src1 (AndV (Binary dst_src1 (Replicate (XorI src2 m1))) v0));
+ format %{ "vand_notS_vx_masked $dst_src1, $dst_src1, $src2, $v0" %}
+ ins_encode %{
+ __ vsetvli_helper(T_SHORT, Matcher::vector_length(this));
+ __ vandn_vx(as_VectorRegister($dst_src1$$reg),
+ as_VectorRegister($dst_src1$$reg),
+ as_Register($src2$$reg),
+ Assembler::v0_t);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notI_vx_masked(vReg dst_src1, iRegIorL2I src2, immI_M1 m1, vRegMask_V0 v0) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_INT);
+ match(Set dst_src1 (AndV (Binary dst_src1 (Replicate (XorI src2 m1))) v0));
+ format %{ "vand_notI_vx_masked $dst_src1, $dst_src1, $src2, $v0" %}
+ ins_encode %{
+ __ vsetvli_helper(T_INT, Matcher::vector_length(this));
+ __ vandn_vx(as_VectorRegister($dst_src1$$reg),
+ as_VectorRegister($dst_src1$$reg),
+ as_Register($src2$$reg),
+ Assembler::v0_t);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vand_notL_vx_masked(vReg dst_src1, iRegL src2, immL_M1 m1, vRegMask_V0 v0) %{
+ predicate(UseZvbb && Matcher::vector_element_basic_type(n) == T_LONG);
+ match(Set dst_src1 (AndV (Binary dst_src1 (Replicate (XorL src2 m1))) v0));
+ format %{ "vand_notL_vx_masked $dst_src1, $dst_src1, $src2, $v0" %}
+ ins_encode %{
+ __ vsetvli_helper(T_LONG, Matcher::vector_length(this));
+ __ vandn_vx(as_VectorRegister($dst_src1$$reg),
+ as_VectorRegister($dst_src1$$reg),
+ as_Register($src2$$reg),
+ Assembler::v0_t);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
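A note on the vand_not* rules above: the matched ideal subgraph AndV(src1, XorV(src2, Replicate(-1))) is C2's shape for an and-not, src1 & ~src2, so with Zvbb it folds into a single vandn instead of a vxor followed by a vand. A minimal per-lane sketch in C++ (illustrative only, not JDK code; the helper name is made up):

    #include <cstdint>

    // Element-wise equivalent of the pattern matched by vand_notI above:
    // dst = src1 & (src2 ^ -1), i.e. src1 & ~src2, which is what vandn.vv
    // computes per lane.
    static inline int32_t and_not_element(int32_t src1, int32_t src2) {
      return src1 & ~src2;
    }
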
// ------------------------------ Vector not -----------------------------------
// vector not
-instruct vnotI(vReg dst, vReg src, immI_M1 m1) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
+instruct vnot(vReg dst, vReg src, immI_M1 m1) %{
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
match(Set dst (XorV src (Replicate m1)));
- format %{ "vnotI $dst, $src" %}
+ format %{ "vnot $dst, $src" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -1148,12 +1511,12 @@ instruct vnotL(vReg dst, vReg src, immL_M1 m1) %{
// vector not - predicated
-instruct vnotI_masked(vReg dst_src, immI_M1 m1, vRegMask_V0 v0) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
+instruct vnot_masked(vReg dst_src, immI_M1 m1, vRegMask_V0 v0) %{
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
match(Set dst_src (XorV (Binary dst_src (Replicate m1)) v0));
- format %{ "vnotI_masked $dst_src, $dst_src, $v0" %}
+ format %{ "vnot_masked $dst_src, $dst_src, $v0" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -1275,6 +1638,66 @@ instruct vmin_masked(vReg dst_src1, vReg src2, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
+// vector unsigned integer max/min
+
+instruct vmaxu(vReg dst, vReg src1, vReg src2) %{
+ match(Set dst (UMaxV src1 src2));
+ ins_cost(VEC_COST);
+ format %{ "vmaxu $dst, $src1, $src2" %}
+ ins_encode %{
+ BasicType bt = Matcher::vector_element_basic_type(this);
+ assert(is_integral_type(bt), "unsupported type");
+ __ vsetvli_helper(bt, Matcher::vector_length(this));
+ __ vmaxu_vv(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg), as_VectorRegister($src2$$reg));
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vminu(vReg dst, vReg src1, vReg src2) %{
+ match(Set dst (UMinV src1 src2));
+ ins_cost(VEC_COST);
+ format %{ "vminu $dst, $src1, $src2" %}
+ ins_encode %{
+ BasicType bt = Matcher::vector_element_basic_type(this);
+ assert(is_integral_type(bt), "unsupported type");
+ __ vsetvli_helper(bt, Matcher::vector_length(this));
+ __ vminu_vv(as_VectorRegister($dst$$reg),
+ as_VectorRegister($src1$$reg), as_VectorRegister($src2$$reg));
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+// vector unsigned integer max/min - predicated
+
+instruct vmaxu_masked(vReg dst_src1, vReg src2, vRegMask_V0 v0) %{
+ match(Set dst_src1 (UMaxV (Binary dst_src1 src2) v0));
+ ins_cost(VEC_COST);
+ format %{ "vmaxu_masked $dst_src1, $dst_src1, $src2, $v0" %}
+ ins_encode %{
+ BasicType bt = Matcher::vector_element_basic_type(this);
+ assert(is_integral_type(bt), "unsupported type");
+ __ vsetvli_helper(bt, Matcher::vector_length(this));
+ __ vmaxu_vv(as_VectorRegister($dst_src1$$reg), as_VectorRegister($dst_src1$$reg),
+ as_VectorRegister($src2$$reg), Assembler::v0_t);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vminu_masked(vReg dst_src1, vReg src2, vRegMask_V0 v0) %{
+ match(Set dst_src1 (UMinV (Binary dst_src1 src2) v0));
+ ins_cost(VEC_COST);
+ format %{ "vminu_masked $dst_src1, $dst_src1, $src2, $v0" %}
+ ins_encode %{
+ BasicType bt = Matcher::vector_element_basic_type(this);
+ assert(is_integral_type(bt), "unsupported type");
+ __ vsetvli_helper(bt, Matcher::vector_length(this));
+ __ vminu_vv(as_VectorRegister($dst_src1$$reg), as_VectorRegister($dst_src1$$reg),
+ as_VectorRegister($src2$$reg), Assembler::v0_t);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
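The UMaxV/UMinV rules above map straight to vmaxu.vv/vminu.vv; the point is that the comparison is done on the unsigned bit pattern of the (signed) Java element values. A per-lane sketch (illustrative only, not JDK code; the helper name is made up):

    #include <cstdint>

    // Per-lane sketch of vmaxu.vv for T_INT elements: the comparison uses
    // the unsigned bit pattern, so -1 (0xFFFFFFFF) compares greater than
    // any non-negative value.
    static inline int32_t umax_element(int32_t a, int32_t b) {
      return (uint32_t)a > (uint32_t)b ? a : b;
    }
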
// vector float-point max/min
instruct vmax_fp(vReg dst, vReg src1, vReg src2, vRegMask_V0 v0) %{
@@ -1663,11 +2086,11 @@ instruct vmul_fp_masked(vReg dst_src1, vReg src2, vRegMask_V0 v0) %{
// vector-scalar mul (unpredicated)
-instruct vmul_regI(vReg dst, vReg src1, iRegIorL2I src2) %{
+instruct vmul_vx(vReg dst, vReg src1, iRegIorL2I src2) %{
match(Set dst (MulVB src1 (Replicate src2)));
match(Set dst (MulVS src1 (Replicate src2)));
match(Set dst (MulVI src1 (Replicate src2)));
- format %{ "vmul_regI $dst, $src1, $src2" %}
+ format %{ "vmul_vx $dst, $src1, $src2" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -1678,9 +2101,9 @@ instruct vmul_regI(vReg dst, vReg src1, iRegIorL2I src2) %{
ins_pipe(pipe_slow);
%}
-instruct vmul_regL(vReg dst, vReg src1, iRegL src2) %{
+instruct vmulL_vx(vReg dst, vReg src1, iRegL src2) %{
match(Set dst (MulVL src1 (Replicate src2)));
- format %{ "vmul_regL $dst, $src1, $src2" %}
+ format %{ "vmulL_vx $dst, $src1, $src2" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
__ vmul_vx(as_VectorRegister($dst$$reg),
@@ -1692,11 +2115,11 @@ instruct vmul_regL(vReg dst, vReg src1, iRegL src2) %{
// vector-scalar mul (predicated)
-instruct vmul_regI_masked(vReg dst_src, iRegIorL2I src2, vRegMask_V0 v0) %{
+instruct vmul_vx_masked(vReg dst_src, iRegIorL2I src2, vRegMask_V0 v0) %{
match(Set dst_src (MulVB (Binary dst_src (Replicate src2)) v0));
match(Set dst_src (MulVS (Binary dst_src (Replicate src2)) v0));
match(Set dst_src (MulVI (Binary dst_src (Replicate src2)) v0));
- format %{ "vmul_regI_masked $dst_src, $dst_src, $src2" %}
+ format %{ "vmul_vx_masked $dst_src, $dst_src, $src2, $v0" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -1707,9 +2130,9 @@ instruct vmul_regI_masked(vReg dst_src, iRegIorL2I src2, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vmul_regL_masked(vReg dst_src, iRegL src2, vRegMask_V0 v0) %{
+instruct vmulL_vx_masked(vReg dst_src, iRegL src2, vRegMask_V0 v0) %{
match(Set dst_src (MulVL (Binary dst_src (Replicate src2)) v0));
- format %{ "vmul_regL_masked $dst_src, $dst_src, $src2" %}
+ format %{ "vmulL_vx_masked $dst_src, $dst_src, $src2, $v0" %}
ins_encode %{
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
__ vmul_vx(as_VectorRegister($dst_src$$reg),
@@ -1783,14 +2206,14 @@ instruct vfneg_masked(vReg dst_src, vRegMask_V0 v0) %{
// vector and reduction
-instruct reduce_andI(iRegINoSp dst, iRegIorL2I src1, vReg src2, vReg tmp) %{
+instruct reduce_and(iRegINoSp dst, iRegIorL2I src1, vReg src2, vReg tmp) %{
predicate(Matcher::vector_element_basic_type(n->in(2)) == T_BYTE ||
Matcher::vector_element_basic_type(n->in(2)) == T_SHORT ||
Matcher::vector_element_basic_type(n->in(2)) == T_INT);
match(Set dst (AndReductionV src1 src2));
effect(TEMP tmp);
ins_cost(VEC_COST);
- format %{ "reduce_andI $dst, $src1, $src2\t# KILL $tmp" %}
+ format %{ "reduce_and $dst, $src1, $src2\t# KILL $tmp" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this, $src2);
__ reduce_integral_v($dst$$Register, $src1$$Register,
@@ -1817,14 +2240,14 @@ instruct reduce_andL(iRegLNoSp dst, iRegL src1, vReg src2, vReg tmp) %{
// vector and reduction - predicated
-instruct reduce_andI_masked(iRegINoSp dst, iRegIorL2I src1, vReg src2, vRegMask_V0 v0, vReg tmp) %{
+instruct reduce_and_masked(iRegINoSp dst, iRegIorL2I src1, vReg src2, vRegMask_V0 v0, vReg tmp) %{
predicate(Matcher::vector_element_basic_type(n->in(2)) == T_BYTE ||
Matcher::vector_element_basic_type(n->in(2)) == T_SHORT ||
Matcher::vector_element_basic_type(n->in(2)) == T_INT);
match(Set dst (AndReductionV (Binary src1 src2) v0));
effect(TEMP tmp);
ins_cost(VEC_COST);
- format %{ "reduce_andI_masked $dst, $src1, $src2, $v0\t# KILL $tmp" %}
+ format %{ "reduce_and_masked $dst, $src1, $src2, $v0\t# KILL $tmp" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this, $src2);
__ reduce_integral_v($dst$$Register, $src1$$Register,
@@ -1853,14 +2276,14 @@ instruct reduce_andL_masked(iRegLNoSp dst, iRegL src1, vReg src2, vRegMask_V0 v0
// vector or reduction
-instruct reduce_orI(iRegINoSp dst, iRegIorL2I src1, vReg src2, vReg tmp) %{
+instruct reduce_or(iRegINoSp dst, iRegIorL2I src1, vReg src2, vReg tmp) %{
predicate(Matcher::vector_element_basic_type(n->in(2)) == T_BYTE ||
Matcher::vector_element_basic_type(n->in(2)) == T_SHORT ||
Matcher::vector_element_basic_type(n->in(2)) == T_INT);
match(Set dst (OrReductionV src1 src2));
effect(TEMP tmp);
ins_cost(VEC_COST);
- format %{ "reduce_orI $dst, $src1, $src2\t# KILL $tmp" %}
+ format %{ "reduce_or $dst, $src1, $src2\t# KILL $tmp" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this, $src2);
__ reduce_integral_v($dst$$Register, $src1$$Register,
@@ -1887,14 +2310,14 @@ instruct reduce_orL(iRegLNoSp dst, iRegL src1, vReg src2, vReg tmp) %{
// vector or reduction - predicated
-instruct reduce_orI_masked(iRegINoSp dst, iRegIorL2I src1, vReg src2, vRegMask_V0 v0, vReg tmp) %{
+instruct reduce_or_masked(iRegINoSp dst, iRegIorL2I src1, vReg src2, vRegMask_V0 v0, vReg tmp) %{
predicate(Matcher::vector_element_basic_type(n->in(2)) == T_BYTE ||
Matcher::vector_element_basic_type(n->in(2)) == T_SHORT ||
Matcher::vector_element_basic_type(n->in(2)) == T_INT);
match(Set dst (OrReductionV (Binary src1 src2) v0));
effect(TEMP tmp);
ins_cost(VEC_COST);
- format %{ "reduce_orI_masked $dst, $src1, $src2, $v0\t# KILL $tmp" %}
+ format %{ "reduce_or_masked $dst, $src1, $src2, $v0\t# KILL $tmp" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this, $src2);
__ reduce_integral_v($dst$$Register, $src1$$Register,
@@ -1923,14 +2346,14 @@ instruct reduce_orL_masked(iRegLNoSp dst, iRegL src1, vReg src2, vRegMask_V0 v0,
// vector xor reduction
-instruct reduce_xorI(iRegINoSp dst, iRegIorL2I src1, vReg src2, vReg tmp) %{
+instruct reduce_xor(iRegINoSp dst, iRegIorL2I src1, vReg src2, vReg tmp) %{
predicate(Matcher::vector_element_basic_type(n->in(2)) == T_BYTE ||
Matcher::vector_element_basic_type(n->in(2)) == T_SHORT ||
Matcher::vector_element_basic_type(n->in(2)) == T_INT);
match(Set dst (XorReductionV src1 src2));
effect(TEMP tmp);
ins_cost(VEC_COST);
- format %{ "reduce_xorI $dst, $src1, $src2\t# KILL $tmp" %}
+ format %{ "reduce_xor $dst, $src1, $src2\t# KILL $tmp" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this, $src2);
__ reduce_integral_v($dst$$Register, $src1$$Register,
@@ -1957,14 +2380,14 @@ instruct reduce_xorL(iRegLNoSp dst, iRegL src1, vReg src2, vReg tmp) %{
// vector xor reduction - predicated
-instruct reduce_xorI_masked(iRegINoSp dst, iRegIorL2I src1, vReg src2, vRegMask_V0 v0, vReg tmp) %{
+instruct reduce_xor_masked(iRegINoSp dst, iRegIorL2I src1, vReg src2, vRegMask_V0 v0, vReg tmp) %{
predicate(Matcher::vector_element_basic_type(n->in(2)) == T_BYTE ||
Matcher::vector_element_basic_type(n->in(2)) == T_SHORT ||
Matcher::vector_element_basic_type(n->in(2)) == T_INT);
match(Set dst (XorReductionV (Binary src1 src2) v0));
effect(TEMP tmp);
ins_cost(VEC_COST);
- format %{ "reduce_xorI_masked $dst, $src1, $src2, $v0\t# KILL $tmp" %}
+ format %{ "reduce_xor_masked $dst, $src1, $src2, $v0\t# KILL $tmp" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this, $src2);
__ reduce_integral_v($dst$$Register, $src1$$Register,
@@ -1993,14 +2416,14 @@ instruct reduce_xorL_masked(iRegLNoSp dst, iRegL src1, vReg src2, vRegMask_V0 v0
// vector add reduction
-instruct reduce_addI(iRegINoSp dst, iRegIorL2I src1, vReg src2, vReg tmp) %{
+instruct reduce_add(iRegINoSp dst, iRegIorL2I src1, vReg src2, vReg tmp) %{
predicate(Matcher::vector_element_basic_type(n->in(2)) == T_BYTE ||
Matcher::vector_element_basic_type(n->in(2)) == T_SHORT ||
Matcher::vector_element_basic_type(n->in(2)) == T_INT);
match(Set dst (AddReductionVI src1 src2));
effect(TEMP tmp);
ins_cost(VEC_COST);
- format %{ "reduce_addI $dst, $src1, $src2\t# KILL $tmp" %}
+ format %{ "reduce_add $dst, $src1, $src2\t# KILL $tmp" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this, $src2);
__ reduce_integral_v($dst$$Register, $src1$$Register,
@@ -2099,14 +2522,14 @@ instruct reduce_addD_unordered(fRegD dst, fRegD src1, vReg src2, vReg tmp) %{
// vector add reduction - predicated
-instruct reduce_addI_masked(iRegINoSp dst, iRegIorL2I src1, vReg src2, vRegMask_V0 v0, vReg tmp) %{
+instruct reduce_add_masked(iRegINoSp dst, iRegIorL2I src1, vReg src2, vRegMask_V0 v0, vReg tmp) %{
predicate(Matcher::vector_element_basic_type(n->in(2)) == T_BYTE ||
Matcher::vector_element_basic_type(n->in(2)) == T_SHORT ||
Matcher::vector_element_basic_type(n->in(2)) == T_INT);
match(Set dst (AddReductionVI (Binary src1 src2) v0));
effect(TEMP tmp);
ins_cost(VEC_COST);
- format %{ "reduce_addI_masked $dst, $src1, $src2, $v0\t# KILL $tmp" %}
+ format %{ "reduce_add_masked $dst, $src1, $src2, $v0\t# KILL $tmp" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this, $src2);
__ reduce_integral_v($dst$$Register, $src1$$Register,
@@ -2165,14 +2588,14 @@ instruct reduce_addD_masked(fRegD dst, fRegD src1, vReg src2, vRegMask_V0 v0, vR
// vector integer max reduction
-instruct vreduce_maxI(iRegINoSp dst, iRegIorL2I src1, vReg src2, vReg tmp) %{
+instruct vreduce_max(iRegINoSp dst, iRegIorL2I src1, vReg src2, vReg tmp) %{
predicate(Matcher::vector_element_basic_type(n->in(2)) == T_BYTE ||
Matcher::vector_element_basic_type(n->in(2)) == T_SHORT ||
Matcher::vector_element_basic_type(n->in(2)) == T_INT);
match(Set dst (MaxReductionV src1 src2));
ins_cost(VEC_COST);
effect(TEMP tmp);
- format %{ "vreduce_maxI $dst, $src1, $src2\t# KILL $tmp" %}
+ format %{ "vreduce_max $dst, $src1, $src2\t# KILL $tmp" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this, $src2);
__ reduce_integral_v($dst$$Register, $src1$$Register,
@@ -2199,14 +2622,14 @@ instruct vreduce_maxL(iRegLNoSp dst, iRegL src1, vReg src2, vReg tmp) %{
// vector integer max reduction - predicated
-instruct vreduce_maxI_masked(iRegINoSp dst, iRegIorL2I src1, vReg src2, vRegMask_V0 v0, vReg tmp) %{
+instruct vreduce_max_masked(iRegINoSp dst, iRegIorL2I src1, vReg src2, vRegMask_V0 v0, vReg tmp) %{
predicate(Matcher::vector_element_basic_type(n->in(2)) == T_BYTE ||
Matcher::vector_element_basic_type(n->in(2)) == T_SHORT ||
Matcher::vector_element_basic_type(n->in(2)) == T_INT);
match(Set dst (MaxReductionV (Binary src1 src2) v0));
effect(TEMP tmp);
ins_cost(VEC_COST);
- format %{ "vreduce_maxI_masked $dst, $src1, $src2, $v0\t# KILL $tmp" %}
+ format %{ "vreduce_max_masked $dst, $src1, $src2, $v0\t# KILL $tmp" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this, $src2);
__ reduce_integral_v($dst$$Register, $src1$$Register,
@@ -2235,14 +2658,14 @@ instruct vreduce_maxL_masked(iRegLNoSp dst, iRegL src1, vReg src2, vRegMask_V0 v
// vector integer min reduction
-instruct vreduce_minI(iRegINoSp dst, iRegIorL2I src1, vReg src2, vReg tmp) %{
+instruct vreduce_min(iRegINoSp dst, iRegIorL2I src1, vReg src2, vReg tmp) %{
predicate(Matcher::vector_element_basic_type(n->in(2)) == T_BYTE ||
Matcher::vector_element_basic_type(n->in(2)) == T_SHORT ||
Matcher::vector_element_basic_type(n->in(2)) == T_INT);
match(Set dst (MinReductionV src1 src2));
ins_cost(VEC_COST);
effect(TEMP tmp);
- format %{ "vreduce_minI $dst, $src1, $src2\t# KILL $tmp" %}
+ format %{ "vreduce_min $dst, $src1, $src2\t# KILL $tmp" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this, $src2);
__ reduce_integral_v($dst$$Register, $src1$$Register,
@@ -2269,14 +2692,14 @@ instruct vreduce_minL(iRegLNoSp dst, iRegL src1, vReg src2, vReg tmp) %{
// vector integer min reduction - predicated
-instruct vreduce_minI_masked(iRegINoSp dst, iRegIorL2I src1, vReg src2, vRegMask_V0 v0, vReg tmp) %{
+instruct vreduce_min_masked(iRegINoSp dst, iRegIorL2I src1, vReg src2, vRegMask_V0 v0, vReg tmp) %{
predicate(Matcher::vector_element_basic_type(n->in(2)) == T_BYTE ||
Matcher::vector_element_basic_type(n->in(2)) == T_SHORT ||
Matcher::vector_element_basic_type(n->in(2)) == T_INT);
match(Set dst (MinReductionV (Binary src1 src2) v0));
effect(TEMP tmp);
ins_cost(VEC_COST);
- format %{ "vreduce_minI_masked $dst, $src1, $src2, $v0\t# KILL $tmp" %}
+ format %{ "vreduce_min_masked $dst, $src1, $src2, $v0\t# KILL $tmp" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this, $src2);
__ reduce_integral_v($dst$$Register, $src1$$Register,
@@ -2984,10 +3407,10 @@ instruct vlsrL_masked(vReg dst_src, vReg shift, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vasrB_imm(vReg dst, vReg src, immI shift) %{
+instruct vasrB_vi(vReg dst, vReg src, immI shift) %{
match(Set dst (RShiftVB src (RShiftCntV shift)));
ins_cost(VEC_COST);
- format %{ "vasrB_imm $dst, $src, $shift" %}
+ format %{ "vasrB_vi $dst, $src, $shift" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_BYTE, Matcher::vector_length(this));
@@ -3002,10 +3425,10 @@ instruct vasrB_imm(vReg dst, vReg src, immI shift) %{
ins_pipe(pipe_slow);
%}
-instruct vasrS_imm(vReg dst, vReg src, immI shift) %{
+instruct vasrS_vi(vReg dst, vReg src, immI shift) %{
match(Set dst (RShiftVS src (RShiftCntV shift)));
ins_cost(VEC_COST);
- format %{ "vasrS_imm $dst, $src, $shift" %}
+ format %{ "vasrS_vi $dst, $src, $shift" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_SHORT, Matcher::vector_length(this));
@@ -3020,10 +3443,10 @@ instruct vasrS_imm(vReg dst, vReg src, immI shift) %{
ins_pipe(pipe_slow);
%}
-instruct vasrI_imm(vReg dst, vReg src, immI shift) %{
+instruct vasrI_vi(vReg dst, vReg src, immI shift) %{
match(Set dst (RShiftVI src (RShiftCntV shift)));
ins_cost(VEC_COST);
- format %{ "vasrI_imm $dst, $src, $shift" %}
+ format %{ "vasrI_vi $dst, $src, $shift" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_INT, Matcher::vector_length(this));
@@ -3037,11 +3460,11 @@ instruct vasrI_imm(vReg dst, vReg src, immI shift) %{
ins_pipe(pipe_slow);
%}
-instruct vasrL_imm(vReg dst, vReg src, immI shift) %{
+instruct vasrL_vi(vReg dst, vReg src, immI shift) %{
predicate((n->in(2)->in(1)->get_int() & 0x3f) < 32);
match(Set dst (RShiftVL src (RShiftCntV shift)));
ins_cost(VEC_COST);
- format %{ "vasrL_imm $dst, $src, $shift" %}
+ format %{ "vasrL_vi $dst, $src, $shift" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
@@ -3055,10 +3478,10 @@ instruct vasrL_imm(vReg dst, vReg src, immI shift) %{
ins_pipe(pipe_slow);
%}
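The _imm to _vi renames in these shift rules follow the RVV operand-suffix convention (vv: vector-vector, vx: vector-scalar, vi: vector-immediate). The T_LONG immediate forms also keep a predicate on the shift constant because the vi shift encodings only carry a 5-bit unsigned immediate. A small sketch of that guard (illustrative only, not JDK code; the helper name is made up):

    // Why the T_LONG immediate forms are guarded with "(shift & 0x3f) < 32":
    // the RVV shift-immediate encodings (vsra.vi/vsrl.vi/vsll.vi) hold a
    // 5-bit unsigned immediate, so 64-bit shift amounts of 32..63 cannot be
    // expressed and must use the vector or scalar-register forms instead.
    static bool long_shift_fits_vi(int shift) {
      int effective = shift & 0x3f;  // LONG shift amounts use the low 6 bits
      return effective < 32;         // only 0..31 fit the 5-bit uimm field
    }
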
-instruct vasrB_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
+instruct vasrB_vi_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
match(Set dst_src (RShiftVB (Binary dst_src (RShiftCntV shift)) v0));
ins_cost(VEC_COST);
- format %{ "vasrB_imm_masked $dst_src, $dst_src, $shift, $v0" %}
+ format %{ "vasrB_vi_masked $dst_src, $dst_src, $shift, $v0" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
if (con == 0) {
@@ -3072,10 +3495,10 @@ instruct vasrB_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vasrS_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
+instruct vasrS_vi_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
match(Set dst_src (RShiftVS (Binary dst_src (RShiftCntV shift)) v0));
ins_cost(VEC_COST);
- format %{ "vasrS_imm_masked $dst_src, $dst_src, $shift, $v0" %}
+ format %{ "vasrS_vi_masked $dst_src, $dst_src, $shift, $v0" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
if (con == 0) {
@@ -3089,10 +3512,10 @@ instruct vasrS_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vasrI_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
+instruct vasrI_vi_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
match(Set dst_src (RShiftVI (Binary dst_src (RShiftCntV shift)) v0));
ins_cost(VEC_COST);
- format %{ "vasrI_imm_masked $dst_src, $dst_src, $shift, $v0" %}
+ format %{ "vasrI_vi_masked $dst_src, $dst_src, $shift, $v0" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
if (con == 0) {
@@ -3105,11 +3528,11 @@ instruct vasrI_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vasrL_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
+instruct vasrL_vi_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
predicate((n->in(1)->in(2)->in(1)->get_int() & 0x3f) < 32);
match(Set dst_src (RShiftVL (Binary dst_src (RShiftCntV shift)) v0));
ins_cost(VEC_COST);
- format %{ "vasrL_imm_masked $dst_src, $dst_src, $shift, $v0" %}
+ format %{ "vasrL_vi_masked $dst_src, $dst_src, $shift, $v0" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
if (con == 0) {
@@ -3122,10 +3545,10 @@ instruct vasrL_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vlsrB_imm(vReg dst, vReg src, immI shift) %{
+instruct vlsrB_vi(vReg dst, vReg src, immI shift) %{
match(Set dst (URShiftVB src (RShiftCntV shift)));
ins_cost(VEC_COST);
- format %{ "vlsrB_imm $dst, $src, $shift" %}
+ format %{ "vlsrB_vi $dst, $src, $shift" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_BYTE, Matcher::vector_length(this));
@@ -3144,10 +3567,10 @@ instruct vlsrB_imm(vReg dst, vReg src, immI shift) %{
ins_pipe(pipe_slow);
%}
-instruct vlsrS_imm(vReg dst, vReg src, immI shift) %{
+instruct vlsrS_vi(vReg dst, vReg src, immI shift) %{
match(Set dst (URShiftVS src (RShiftCntV shift)));
ins_cost(VEC_COST);
- format %{ "vlsrS_imm $dst, $src, $shift" %}
+ format %{ "vlsrS_vi $dst, $src, $shift" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_SHORT, Matcher::vector_length(this));
@@ -3166,10 +3589,10 @@ instruct vlsrS_imm(vReg dst, vReg src, immI shift) %{
ins_pipe(pipe_slow);
%}
-instruct vlsrI_imm(vReg dst, vReg src, immI shift) %{
+instruct vlsrI_vi(vReg dst, vReg src, immI shift) %{
match(Set dst (URShiftVI src (RShiftCntV shift)));
ins_cost(VEC_COST);
- format %{ "vlsrI_imm $dst, $src, $shift" %}
+ format %{ "vlsrI_vi $dst, $src, $shift" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_INT, Matcher::vector_length(this));
@@ -3183,11 +3606,11 @@ instruct vlsrI_imm(vReg dst, vReg src, immI shift) %{
ins_pipe(pipe_slow);
%}
-instruct vlsrL_imm(vReg dst, vReg src, immI shift) %{
+instruct vlsrL_vi(vReg dst, vReg src, immI shift) %{
predicate((n->in(2)->in(1)->get_int() & 0x3f) < 32);
match(Set dst (URShiftVL src (RShiftCntV shift)));
ins_cost(VEC_COST);
- format %{ "vlsrL_imm $dst, $src, $shift" %}
+ format %{ "vlsrL_vi $dst, $src, $shift" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
@@ -3201,10 +3624,10 @@ instruct vlsrL_imm(vReg dst, vReg src, immI shift) %{
ins_pipe(pipe_slow);
%}
-instruct vlsrB_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
+instruct vlsrB_vi_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
match(Set dst_src (URShiftVB (Binary dst_src (RShiftCntV shift)) v0));
ins_cost(VEC_COST);
- format %{ "vlsrB_imm_masked $dst_src, $dst_src, $shift, $v0" %}
+ format %{ "vlsrB_vi_masked $dst_src, $dst_src, $shift, $v0" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
if (con == 0) {
@@ -3222,10 +3645,10 @@ instruct vlsrB_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vlsrS_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
+instruct vlsrS_vi_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
match(Set dst_src (URShiftVS (Binary dst_src (RShiftCntV shift)) v0));
ins_cost(VEC_COST);
- format %{ "vlsrS_imm_masked $dst_src, $dst_src, $shift, $v0" %}
+ format %{ "vlsrS_vi_masked $dst_src, $dst_src, $shift, $v0" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
if (con == 0) {
@@ -3243,10 +3666,10 @@ instruct vlsrS_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vlsrI_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
+instruct vlsrI_vi_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
match(Set dst_src (URShiftVI (Binary dst_src (RShiftCntV shift)) v0));
ins_cost(VEC_COST);
- format %{ "vlsrI_imm_masked $dst_src, $dst_src, $shift, $v0" %}
+ format %{ "vlsrI_vi_masked $dst_src, $dst_src, $shift, $v0" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
if (con == 0) {
@@ -3259,11 +3682,11 @@ instruct vlsrI_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vlsrL_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
+instruct vlsrL_vi_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
predicate((n->in(1)->in(2)->in(1)->get_int() & 0x3f) < 32);
match(Set dst_src (URShiftVL (Binary dst_src (RShiftCntV shift)) v0));
ins_cost(VEC_COST);
- format %{ "vlsrL_imm_masked $dst_src, $dst_src, $shift, $v0" %}
+ format %{ "vlsrL_vi_masked $dst_src, $dst_src, $shift, $v0" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
if (con == 0) {
@@ -3276,10 +3699,10 @@ instruct vlsrL_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vlslB_imm(vReg dst, vReg src, immI shift) %{
+instruct vlslB_vi(vReg dst, vReg src, immI shift) %{
match(Set dst (LShiftVB src (LShiftCntV shift)));
ins_cost(VEC_COST);
- format %{ "vlslB_imm $dst, $src, $shift" %}
+ format %{ "vlslB_vi $dst, $src, $shift" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_BYTE, Matcher::vector_length(this));
@@ -3293,10 +3716,10 @@ instruct vlslB_imm(vReg dst, vReg src, immI shift) %{
ins_pipe(pipe_slow);
%}
-instruct vlslS_imm(vReg dst, vReg src, immI shift) %{
+instruct vlslS_vi(vReg dst, vReg src, immI shift) %{
match(Set dst (LShiftVS src (LShiftCntV shift)));
ins_cost(VEC_COST);
- format %{ "vlslS_imm $dst, $src, $shift" %}
+ format %{ "vlslS_vi $dst, $src, $shift" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_SHORT, Matcher::vector_length(this));
@@ -3310,10 +3733,10 @@ instruct vlslS_imm(vReg dst, vReg src, immI shift) %{
ins_pipe(pipe_slow);
%}
-instruct vlslI_imm(vReg dst, vReg src, immI shift) %{
+instruct vlslI_vi(vReg dst, vReg src, immI shift) %{
match(Set dst (LShiftVI src (LShiftCntV shift)));
ins_cost(VEC_COST);
- format %{ "vlslI_imm $dst, $src, $shift" %}
+ format %{ "vlslI_vi $dst, $src, $shift" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_INT, Matcher::vector_length(this));
@@ -3322,11 +3745,11 @@ instruct vlslI_imm(vReg dst, vReg src, immI shift) %{
ins_pipe(pipe_slow);
%}
-instruct vlslL_imm(vReg dst, vReg src, immI shift) %{
+instruct vlslL_vi(vReg dst, vReg src, immI shift) %{
predicate((n->in(2)->in(1)->get_int() & 0x3f) < 32);
match(Set dst (LShiftVL src (LShiftCntV shift)));
ins_cost(VEC_COST);
- format %{ "vlslL_imm $dst, $src, $shift" %}
+ format %{ "vlslL_vi $dst, $src, $shift" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
@@ -3335,10 +3758,10 @@ instruct vlslL_imm(vReg dst, vReg src, immI shift) %{
ins_pipe(pipe_slow);
%}
-instruct vlslB_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
+instruct vlslB_vi_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
match(Set dst_src (LShiftVB (Binary dst_src (LShiftCntV shift)) v0));
ins_cost(VEC_COST);
- format %{ "vlslB_imm_masked $dst_src, $dst_src, $shift, $v0" %}
+ format %{ "vlslB_vi_masked $dst_src, $dst_src, $shift, $v0" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_BYTE, Matcher::vector_length(this));
@@ -3353,10 +3776,10 @@ instruct vlslB_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vlslS_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
+instruct vlslS_vi_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
match(Set dst_src (LShiftVS (Binary dst_src (LShiftCntV shift)) v0));
ins_cost(VEC_COST);
- format %{ "vlslS_imm_masked $dst_src, $dst_src, $shift, $v0" %}
+ format %{ "vlslS_vi_masked $dst_src, $dst_src, $shift, $v0" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_SHORT, Matcher::vector_length(this));
@@ -3371,10 +3794,10 @@ instruct vlslS_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vlslI_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
+instruct vlslI_vi_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
match(Set dst_src (LShiftVI (Binary dst_src (LShiftCntV shift)) v0));
ins_cost(VEC_COST);
- format %{ "vlslI_imm_masked $dst_src, $dst_src, $shift, $v0" %}
+ format %{ "vlslI_vi_masked $dst_src, $dst_src, $shift, $v0" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_INT, Matcher::vector_length(this));
@@ -3384,11 +3807,11 @@ instruct vlslI_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vlslL_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
+instruct vlslL_vi_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
predicate((n->in(1)->in(2)->in(1)->get_int() & 0x3f) < 32);
match(Set dst_src (LShiftVL (Binary dst_src (LShiftCntV shift)) v0));
ins_cost(VEC_COST);
- format %{ "vlslL_imm_masked $dst_src, $dst_src, $shift, $v0" %}
+ format %{ "vlslL_vi_masked $dst_src, $dst_src, $shift, $v0" %}
ins_encode %{
uint32_t con = (unsigned)$shift$$constant & 0x1f;
__ vsetvli_helper(T_LONG, Matcher::vector_length(this));
@@ -3413,27 +3836,6 @@ instruct vshiftcnt(vReg dst, iRegIorL2I cnt) %{
%}
// --------------------------------- Vector Rotation ----------------------------------
-//
-// Following rotate instruct's are shared by vectorization (in SLP, superword.cpp) and Vector API.
-//
-// Rotate behaviour in vectorization is defined by java API, which includes:
-// 1. Integer.rorateRight, Integer.rorateLeft.
-// `rotation by any multiple of 32 is a no-op, so all but the last five bits of the rotation distance can be ignored`.
-// 2. Long.rorateRight, Long.rorateLeft.
-// `rotation by any multiple of 64 is a no-op, so all but the last six bits of the rotation distance can be ignored`.
-//
-// Rotate behaviour in Vector API is defined as below, e.g.
-// 1. For Byte ROR, `a ROR b` is: (byte)(((((byte)a) & 0xFF) >>> (b & 7)) | ((((byte)a) & 0xFF) << (8 - (b & 7))))
-// 2. For Short ROR, `a ROR b` is: (short)(((((short)a) & 0xFFFF) >>> (b & 15)) | ((((short)a) & 0xFFFF) << (16 - (b & 15))))
-// 3. For Integer ROR, `a ROR b` is: Integer.rotateRight(a, ((int)b))
-// 4. For Long ROR, `a ROR b` is: Long.rotateRight(a, ((int)b))
-//
-// Basically, the behaviour between vectorization and Vector API is the same for Long and Integer, except that Vector API
-// also supports Byte and Short rotation. But we can still share the intrinsics between vectorization and Vector API.
-//
-// NOTE: As vror.vi encodes 6-bits immediate rotate amount, which is different from other vector-immediate instructions,
-// implementation of vector rotation for long and other types can be unified.
-
// Rotate right
instruct vrotate_right(vReg dst, vReg src, vReg shift) %{
@@ -3448,9 +3850,10 @@ instruct vrotate_right(vReg dst, vReg src, vReg shift) %{
ins_pipe(pipe_slow);
%}
-instruct vrotate_right_reg(vReg dst, vReg src, iRegIorL2I shift) %{
+// Only the low log2(SEW) bits of the shift value are used; all other bits are ignored.

+instruct vrotate_right_vx(vReg dst, vReg src, iRegIorL2I shift) %{
match(Set dst (RotateRightV src (Replicate shift)));
- format %{ "vrotate_right_reg $dst, $src, $shift\t" %}
+ format %{ "vrotate_right_vx $dst, $src, $shift\t" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -3460,9 +3863,9 @@ instruct vrotate_right_reg(vReg dst, vReg src, iRegIorL2I shift) %{
ins_pipe(pipe_slow);
%}
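The comment added above states that only the low log2(SEW) bits of the shift value are used. For 32-bit lanes that is the low 5 bits, matching Integer.rotateRight. A per-lane sketch (illustrative only, not JDK code; the helper name is made up):

    #include <cstdint>

    // Per-lane rotate right for 32-bit elements; only the low 5 bits of the
    // shift amount matter. The s == 0 check avoids an undefined shift by 32.
    static inline uint32_t ror32(uint32_t x, uint32_t s) {
      s &= 31;
      return s == 0 ? x : (x >> s) | (x << (32 - s));
    }
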
-instruct vrotate_right_imm(vReg dst, vReg src, immI shift) %{
+instruct vrotate_right_vi(vReg dst, vReg src, immI shift) %{
match(Set dst (RotateRightV src shift));
- format %{ "vrotate_right_imm $dst, $src, $shift\t" %}
+ format %{ "vrotate_right_vi $dst, $src, $shift\t" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
uint32_t bits = type2aelembytes(bt) * 8;
@@ -3480,7 +3883,7 @@ instruct vrotate_right_imm(vReg dst, vReg src, immI shift) %{
instruct vrotate_right_masked(vReg dst_src, vReg shift, vRegMask_V0 v0) %{
match(Set dst_src (RotateRightV (Binary dst_src shift) v0));
- format %{ "vrotate_right_masked $dst_src, $dst_src, $shift, v0.t\t" %}
+ format %{ "vrotate_right_masked $dst_src, $dst_src, $shift, $v0\t" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -3490,9 +3893,10 @@ instruct vrotate_right_masked(vReg dst_src, vReg shift, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vrotate_right_reg_masked(vReg dst_src, iRegIorL2I shift, vRegMask_V0 v0) %{
+// Only the low log2(SEW) bits of the shift value are used; all other bits are ignored.
+instruct vrotate_right_vx_masked(vReg dst_src, iRegIorL2I shift, vRegMask_V0 v0) %{
match(Set dst_src (RotateRightV (Binary dst_src (Replicate shift)) v0));
- format %{ "vrotate_right_reg_masked $dst_src, $dst_src, $shift, v0.t\t" %}
+ format %{ "vrotate_right_vx_masked $dst_src, $dst_src, $shift, $v0\t" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -3502,9 +3906,9 @@ instruct vrotate_right_reg_masked(vReg dst_src, iRegIorL2I shift, vRegMask_V0 v0
ins_pipe(pipe_slow);
%}
-instruct vrotate_right_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
+instruct vrotate_right_vi_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
match(Set dst_src (RotateRightV (Binary dst_src shift) v0));
- format %{ "vrotate_right_imm_masked $dst_src, $dst_src, $shift, v0.t\t" %}
+ format %{ "vrotate_right_vi_masked $dst_src, $dst_src, $shift, $v0\t" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
uint32_t bits = type2aelembytes(bt) * 8;
@@ -3533,9 +3937,10 @@ instruct vrotate_left(vReg dst, vReg src, vReg shift) %{
ins_pipe(pipe_slow);
%}
-instruct vrotate_left_reg(vReg dst, vReg src, iRegIorL2I shift) %{
+// Only the low log2(SEW) bits of the shift value are used; all other bits are ignored.
+instruct vrotate_left_vx(vReg dst, vReg src, iRegIorL2I shift) %{
match(Set dst (RotateLeftV src (Replicate shift)));
- format %{ "vrotate_left_reg $dst, $src, $shift\t" %}
+ format %{ "vrotate_left_vx $dst, $src, $shift\t" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -3545,9 +3950,9 @@ instruct vrotate_left_reg(vReg dst, vReg src, iRegIorL2I shift) %{
ins_pipe(pipe_slow);
%}
-instruct vrotate_left_imm(vReg dst, vReg src, immI shift) %{
+instruct vrotate_left_vi(vReg dst, vReg src, immI shift) %{
match(Set dst (RotateLeftV src shift));
- format %{ "vrotate_left_imm $dst, $src, $shift\t" %}
+ format %{ "vrotate_left_vi $dst, $src, $shift\t" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
uint32_t bits = type2aelembytes(bt) * 8;
@@ -3566,7 +3971,7 @@ instruct vrotate_left_imm(vReg dst, vReg src, immI shift) %{
instruct vrotate_left_masked(vReg dst_src, vReg shift, vRegMask_V0 v0) %{
match(Set dst_src (RotateLeftV (Binary dst_src shift) v0));
- format %{ "vrotate_left_masked $dst_src, $dst_src, $shift, v0.t\t" %}
+ format %{ "vrotate_left_masked $dst_src, $dst_src, $shift, $v0\t" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -3576,9 +3981,10 @@ instruct vrotate_left_masked(vReg dst_src, vReg shift, vRegMask_V0 v0) %{
ins_pipe(pipe_slow);
%}
-instruct vrotate_left_reg_masked(vReg dst_src, iRegIorL2I shift, vRegMask_V0 v0) %{
+// Only the low log2(SEW) bits of the shift value are used; all other bits are ignored.
+instruct vrotate_left_vx_masked(vReg dst_src, iRegIorL2I shift, vRegMask_V0 v0) %{
match(Set dst_src (RotateLeftV (Binary dst_src (Replicate shift)) v0));
- format %{ "vrotate_left_reg_masked $dst_src, $dst_src, $shift, v0.t\t" %}
+ format %{ "vrotate_left_vx_masked $dst_src, $dst_src, $shift, $v0\t" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -3588,9 +3994,9 @@ instruct vrotate_left_reg_masked(vReg dst_src, iRegIorL2I shift, vRegMask_V0 v0)
ins_pipe(pipe_slow);
%}
-instruct vrotate_left_imm_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
+instruct vrotate_left_vi_masked(vReg dst_src, immI shift, vRegMask_V0 v0) %{
match(Set dst_src (RotateLeftV (Binary dst_src shift) v0));
- format %{ "vrotate_left_imm_masked $dst_src, $dst_src, $shift, v0.t\t" %}
+ format %{ "vrotate_left_vi_masked $dst_src, $dst_src, $shift, $v0\t" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
uint32_t bits = type2aelembytes(bt) * 8;
@@ -4186,8 +4592,8 @@ instruct vcvtStoB(vReg dst, vReg src) %{
%}
instruct vcvtStoX(vReg dst, vReg src) %{
- predicate((Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_LONG));
+ predicate(Matcher::vector_element_basic_type(n) == T_INT ||
+ Matcher::vector_element_basic_type(n) == T_LONG);
match(Set dst (VectorCastS2X src));
effect(TEMP_DEF dst);
format %{ "vcvtStoX $dst, $src" %}
@@ -4200,8 +4606,8 @@ instruct vcvtStoX(vReg dst, vReg src) %{
%}
instruct vcvtStoX_fp(vReg dst, vReg src) %{
- predicate((Matcher::vector_element_basic_type(n) == T_FLOAT ||
- Matcher::vector_element_basic_type(n) == T_DOUBLE));
+ predicate(Matcher::vector_element_basic_type(n) == T_FLOAT ||
+ Matcher::vector_element_basic_type(n) == T_DOUBLE);
match(Set dst (VectorCastS2X src));
effect(TEMP_DEF dst);
format %{ "vcvtStoX_fp $dst, $src" %}
@@ -4298,9 +4704,9 @@ instruct vcvtItoD(vReg dst, vReg src) %{
// VectorCastL2X
instruct vcvtLtoI(vReg dst, vReg src) %{
- predicate(Matcher::vector_element_basic_type(n) == T_INT ||
- Matcher::vector_element_basic_type(n) == T_BYTE ||
- Matcher::vector_element_basic_type(n) == T_SHORT);
+ predicate(Matcher::vector_element_basic_type(n) == T_BYTE ||
+ Matcher::vector_element_basic_type(n) == T_SHORT ||
+ Matcher::vector_element_basic_type(n) == T_INT);
match(Set dst (VectorCastL2X src));
format %{ "vcvtLtoI $dst, $src" %}
ins_encode %{
@@ -5036,14 +5442,14 @@ instruct populateindex(vReg dst, iRegIorL2I src1, iRegIorL2I src2, vReg tmp) %{
// BYTE, SHORT, INT
-instruct insertI_index_lt32(vReg dst, vReg src, iRegIorL2I val, immI idx, vRegMask_V0 v0) %{
+instruct insert_index_lt32(vReg dst, vReg src, iRegIorL2I val, immI idx, vRegMask_V0 v0) %{
predicate(n->in(2)->get_int() < 32 &&
(Matcher::vector_element_basic_type(n) == T_BYTE ||
Matcher::vector_element_basic_type(n) == T_SHORT ||
Matcher::vector_element_basic_type(n) == T_INT));
match(Set dst (VectorInsert (Binary src val) idx));
effect(TEMP v0);
- format %{ "insertI_index_lt32 $dst, $src, $val, $idx" %}
+ format %{ "insert_index_lt32 $dst, $src, $val, $idx" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
@@ -5055,14 +5461,14 @@ instruct insertI_index_lt32(vReg dst, vReg src, iRegIorL2I val, immI idx, vRegMa
ins_pipe(pipe_slow);
%}
-instruct insertI_index(vReg dst, vReg src, iRegIorL2I val, iRegIorL2I idx, vReg tmp, vRegMask_V0 v0) %{
+instruct insert_index(vReg dst, vReg src, iRegIorL2I val, iRegIorL2I idx, vReg tmp, vRegMask_V0 v0) %{
predicate(n->in(2)->get_int() >= 32 &&
(Matcher::vector_element_basic_type(n) == T_BYTE ||
Matcher::vector_element_basic_type(n) == T_SHORT ||
Matcher::vector_element_basic_type(n) == T_INT));
match(Set dst (VectorInsert (Binary src val) idx));
effect(TEMP tmp, TEMP v0);
- format %{ "insertI_index $dst, $src, $val, $idx\t# KILL $tmp" %}
+ format %{ "insert_index $dst, $src, $val, $idx\t# KILL $tmp" %}
ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ vsetvli_helper(bt, Matcher::vector_length(this));
diff --git a/src/hotspot/cpu/riscv/runtime_riscv.cpp b/src/hotspot/cpu/riscv/runtime_riscv.cpp
index 44a8e35e285b9..7c8ca853bc40b 100644
--- a/src/hotspot/cpu/riscv/runtime_riscv.cpp
+++ b/src/hotspot/cpu/riscv/runtime_riscv.cpp
@@ -63,6 +63,9 @@ UncommonTrapBlob* OptoRuntime::generate_uncommon_trap_blob() {
// Setup code generation tools
const char* name = OptoRuntime::stub_name(OptoStubId::uncommon_trap_id);
CodeBuffer buffer(name, 2048, 1024);
+ if (buffer.blob() == nullptr) {
+ return nullptr;
+ }
MacroAssembler* masm = new MacroAssembler(&buffer);
assert_cond(masm != nullptr);
@@ -282,6 +285,9 @@ ExceptionBlob* OptoRuntime::generate_exception_blob() {
// Setup code generation tools
const char* name = OptoRuntime::stub_name(OptoStubId::exception_id);
CodeBuffer buffer(name, 2048, 1024);
+ if (buffer.blob() == nullptr) {
+ return nullptr;
+ }
MacroAssembler* masm = new MacroAssembler(&buffer);
assert_cond(masm != nullptr);
diff --git a/src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp b/src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp
index 49e630bbfdf91..5f53485a97fe1 100644
--- a/src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp
+++ b/src/hotspot/cpu/riscv/sharedRuntime_riscv.cpp
@@ -596,12 +596,13 @@ void SharedRuntime::gen_i2c_adapter(MacroAssembler *masm,
}
// ---------------------------------------------------------------
-AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm,
- int total_args_passed,
- int comp_args_on_stack,
- const BasicType *sig_bt,
- const VMRegPair *regs,
- AdapterFingerPrint* fingerprint) {
+
+void SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm,
+ int total_args_passed,
+ int comp_args_on_stack,
+ const BasicType *sig_bt,
+ const VMRegPair *regs,
+ AdapterHandlerEntry* handler) {
address i2c_entry = __ pc();
gen_i2c_adapter(masm, total_args_passed, comp_args_on_stack, sig_bt, regs);
@@ -658,7 +659,8 @@ AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm
gen_c2i_adapter(masm, total_args_passed, comp_args_on_stack, sig_bt, regs, skip_fixup);
- return AdapterHandlerLibrary::new_entry(fingerprint, i2c_entry, c2i_entry, c2i_unverified_entry, c2i_no_clinit_check_entry);
+ handler->set_entry_points(i2c_entry, c2i_entry, c2i_unverified_entry, c2i_no_clinit_check_entry);
+ return;
}
int SharedRuntime::vector_calling_convention(VMRegPair *regs,
@@ -2646,7 +2648,7 @@ RuntimeStub* SharedRuntime::generate_resolve_blob(SharedStubId id, address desti
__ bnez(t1, pending);
// get the returned Method*
- __ get_vm_result_2(xmethod, xthread);
+ __ get_vm_result_metadata(xmethod, xthread);
__ sd(xmethod, Address(sp, reg_saver.reg_offset_in_bytes(xmethod)));
// x10 is where we want to jump, overwrite t1 which is saved and temporary
@@ -2664,7 +2666,7 @@ RuntimeStub* SharedRuntime::generate_resolve_blob(SharedStubId id, address desti
// exception pending => remove activation and forward to exception handler
- __ sd(zr, Address(xthread, JavaThread::vm_result_offset()));
+ __ sd(zr, Address(xthread, JavaThread::vm_result_oop_offset()));
__ ld(x10, Address(xthread, Thread::pending_exception_offset()));
__ far_jump(RuntimeAddress(StubRoutines::forward_exception_entry()));
diff --git a/src/hotspot/cpu/riscv/stubGenerator_riscv.cpp b/src/hotspot/cpu/riscv/stubGenerator_riscv.cpp
index 4527a32926f52..fb4539267ae98 100644
--- a/src/hotspot/cpu/riscv/stubGenerator_riscv.cpp
+++ b/src/hotspot/cpu/riscv/stubGenerator_riscv.cpp
@@ -6458,58 +6458,6 @@ static const int64_t right_3_bits = right_n_bits(3);
return start;
}
- void generate_vector_math_stubs() {
- if (!UseRVV) {
- log_info(library)("vector is not supported, skip loading vector math (sleef) library!");
- return;
- }
-
- // Get native vector math stub routine addresses
- void* libsleef = nullptr;
- char ebuf[1024];
- char dll_name[JVM_MAXPATHLEN];
- if (os::dll_locate_lib(dll_name, sizeof(dll_name), Arguments::get_dll_dir(), "sleef")) {
- libsleef = os::dll_load(dll_name, ebuf, sizeof ebuf);
- }
- if (libsleef == nullptr) {
- log_info(library)("Failed to load native vector math (sleef) library, %s!", ebuf);
- return;
- }
-
- // Method naming convention
- // All the methods are named as <OP><T>_<U><suffix>
- //
- // Where:
- // <OP> is the operation name, e.g. sin, cos
- // <T> is to indicate float/double
- // "fx/dx" for vector float/double operation
- // <U> is the precision level
- // "u10/u05" represents 1.0/0.5 ULP error bounds
- // We use "u10" for all operations by default
- // But for those functions do not have u10 support, we use "u05" instead
- // <suffix> rvv, indicates riscv vector extension
- //
- // e.g. sinfx_u10rvv is the method for computing vector float sin using rvv instructions
- //
- log_info(library)("Loaded library %s, handle " INTPTR_FORMAT, JNI_LIB_PREFIX "sleef" JNI_LIB_SUFFIX, p2i(libsleef));
-
- for (int op = 0; op < VectorSupport::NUM_VECTOR_OP_MATH; op++) {
- int vop = VectorSupport::VECTOR_OP_MATH_START + op;
- if (vop == VectorSupport::VECTOR_OP_TANH) { // skip tanh because of performance regression
- continue;
- }
-
- // The native library does not support u10 level of "hypot".
- const char* ulf = (vop == VectorSupport::VECTOR_OP_HYPOT) ? "u05" : "u10";
-
- snprintf(ebuf, sizeof(ebuf), "%sfx_%srvv", VectorSupport::mathname[op], ulf);
- StubRoutines::_vector_f_math[VectorSupport::VEC_SIZE_SCALABLE][op] = (address)os::dll_lookup(libsleef, ebuf);
-
- snprintf(ebuf, sizeof(ebuf), "%sdx_%srvv", VectorSupport::mathname[op], ulf);
- StubRoutines::_vector_d_math[VectorSupport::VEC_SIZE_SCALABLE][op] = (address)os::dll_lookup(libsleef, ebuf);
- }
- }
-
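The block removed above resolved SLEEF routines by name. For reference, a sketch of how such a symbol is assembled, mirroring the removed snprintf calls (illustrative only, not JDK code; the helper name is made up):

    #include <cstddef>
    #include <cstdio>

    // Builds names like "sinfx_u10rvv" (float sin, 1.0 ULP bound, RVV build
    // of SLEEF), following the naming convention documented in the removed
    // comment block.
    static void sleef_symbol(char* buf, size_t len,
                             const char* op, bool is_double, bool use_u05) {
      snprintf(buf, len, "%s%s_%srvv",
               op, is_double ? "dx" : "fx", use_u05 ? "u05" : "u10");
    }
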
#endif // COMPILER2
/**
@@ -6741,8 +6689,6 @@ static const int64_t right_3_bits = right_n_bits(3);
generate_string_indexof_stubs();
- generate_vector_math_stubs();
-
#endif // COMPILER2
}
diff --git a/src/hotspot/cpu/riscv/templateInterpreterGenerator_riscv.cpp b/src/hotspot/cpu/riscv/templateInterpreterGenerator_riscv.cpp
index e96bd2e1f2a73..72e1180164b77 100644
--- a/src/hotspot/cpu/riscv/templateInterpreterGenerator_riscv.cpp
+++ b/src/hotspot/cpu/riscv/templateInterpreterGenerator_riscv.cpp
@@ -1725,11 +1725,11 @@ void TemplateInterpreterGenerator::generate_throw_exception() {
// preserve exception over this code sequence
__ pop_ptr(x10);
- __ sd(x10, Address(xthread, JavaThread::vm_result_offset()));
+ __ sd(x10, Address(xthread, JavaThread::vm_result_oop_offset()));
// remove the activation (without doing throws on illegalMonitorExceptions)
__ remove_activation(vtos, false, true, false);
// restore exception
- __ get_vm_result(x10, xthread);
+ __ get_vm_result_oop(x10, xthread);
// In between activations - previous activation type unknown yet
// compute continuation point - the continuation point expects the
diff --git a/src/hotspot/cpu/riscv/templateTable_riscv.cpp b/src/hotspot/cpu/riscv/templateTable_riscv.cpp
index 216cfdeed7930..a035326be0130 100644
--- a/src/hotspot/cpu/riscv/templateTable_riscv.cpp
+++ b/src/hotspot/cpu/riscv/templateTable_riscv.cpp
@@ -460,7 +460,7 @@ void TemplateTable::condy_helper(Label& Done) {
__ mv(rarg, (int) bytecode());
__ call_VM(obj, entry, rarg);
- __ get_vm_result_2(flags, xthread);
+ __ get_vm_result_metadata(flags, xthread);
// VMr = obj = base address to find primitive value to push
// VMr2 = flags = (tos, off) using format of CPCE::_flags
@@ -3657,8 +3657,7 @@ void TemplateTable::checkcast() {
__ push(atos); // save receiver for result, and for GC
call_VM(x10, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
- // vm_result_2 has metadata result
- __ get_vm_result_2(x10, xthread);
+ __ get_vm_result_metadata(x10, xthread);
__ pop_reg(x13); // restore receiver
__ j(resolved);
@@ -3712,8 +3711,7 @@ void TemplateTable::instanceof() {
__ push(atos); // save receiver for result, and for GC
call_VM(x10, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
- // vm_result_2 has metadata result
- __ get_vm_result_2(x10, xthread);
+ __ get_vm_result_metadata(x10, xthread);
__ pop_reg(x13); // restore receiver
__ verify_oop(x13);
__ load_klass(x13, x13);
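For readers following the xthread accessors in this and the surrounding hunks, the renames line up as follows (summarized from the hunks themselves):

    // Old accessor / field               -> New accessor / field (as used above)
    //   get_vm_result(reg, thread)       -> get_vm_result_oop(reg, thread)
    //   get_vm_result_2(reg, thread)     -> get_vm_result_metadata(reg, thread)
    //   JavaThread::vm_result_offset()   -> JavaThread::vm_result_oop_offset()
    //   JavaThread::vm_result_2_offset() -> JavaThread::vm_result_metadata_offset()
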
diff --git a/src/hotspot/cpu/riscv/vm_version_riscv.cpp b/src/hotspot/cpu/riscv/vm_version_riscv.cpp
index a0de9d767bfb2..28c32ed33c824 100644
--- a/src/hotspot/cpu/riscv/vm_version_riscv.cpp
+++ b/src/hotspot/cpu/riscv/vm_version_riscv.cpp
@@ -248,14 +248,6 @@ void VM_Version::common_initialize() {
#ifdef COMPILER2
void VM_Version::c2_initialize() {
- if (UseCMoveUnconditionally) {
- FLAG_SET_DEFAULT(UseCMoveUnconditionally, false);
- }
-
- if (ConditionalMoveLimit > 0) {
- FLAG_SET_DEFAULT(ConditionalMoveLimit, 0);
- }
-
if (!UseRVV) {
FLAG_SET_DEFAULT(MaxVectorSize, 0);
} else {
@@ -476,7 +468,7 @@ void VM_Version::initialize_cpu_information(void) {
_no_of_threads = _no_of_cores;
_no_of_sockets = _no_of_cores;
snprintf(_cpu_name, CPU_TYPE_DESC_BUF_SIZE - 1, "RISCV64");
- snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "RISCV64 %s", features_string());
+ snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "RISCV64 %s", cpu_info_string());
_initialized = true;
}
diff --git a/src/hotspot/cpu/s390/c1_CodeStubs_s390.cpp b/src/hotspot/cpu/s390/c1_CodeStubs_s390.cpp
index c858a4b8cb14b..430928a66ed85 100644
--- a/src/hotspot/cpu/s390/c1_CodeStubs_s390.cpp
+++ b/src/hotspot/cpu/s390/c1_CodeStubs_s390.cpp
@@ -52,7 +52,7 @@ void RangeCheckStub::emit_code(LIR_Assembler* ce) {
CHECK_BAILOUT();
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
return;
}
@@ -74,7 +74,7 @@ void RangeCheckStub::emit_code(LIR_Assembler* ce) {
CHECK_BAILOUT();
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
PredicateFailedStub::PredicateFailedStub(CodeEmitInfo* info) {
@@ -88,7 +88,7 @@ void PredicateFailedStub::emit_code(LIR_Assembler* ce) {
CHECK_BAILOUT();
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
void CounterOverflowStub::emit_code(LIR_Assembler* ce) {
@@ -116,7 +116,7 @@ void DivByZeroStub::emit_code(LIR_Assembler* ce) {
ce->emit_call_c(Runtime1::entry_for (C1StubId::throw_div0_exception_id));
CHECK_BAILOUT();
ce->add_call_info_here(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
void ImplicitNullCheckStub::emit_code(LIR_Assembler* ce) {
@@ -134,7 +134,7 @@ void ImplicitNullCheckStub::emit_code(LIR_Assembler* ce) {
CHECK_BAILOUT();
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
// Note: pass object in Z_R1_scratch
@@ -147,7 +147,7 @@ void SimpleExceptionStub::emit_code(LIR_Assembler* ce) {
ce->emit_call_c(a);
CHECK_BAILOUT();
ce->add_call_info_here(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
NewInstanceStub::NewInstanceStub(LIR_Opr klass_reg, LIR_Opr result, ciInstanceKlass* klass, CodeEmitInfo* info, C1StubId stub_id) {
diff --git a/src/hotspot/cpu/s390/c1_FrameMap_s390.cpp b/src/hotspot/cpu/s390/c1_FrameMap_s390.cpp
index 9fa6da8341ff8..ddba445154a2a 100644
--- a/src/hotspot/cpu/s390/c1_FrameMap_s390.cpp
+++ b/src/hotspot/cpu/s390/c1_FrameMap_s390.cpp
@@ -144,13 +144,13 @@ LIR_Opr FrameMap::_caller_save_fpu_regs[] = {};
// c1 rnr -> FloatRegister
FloatRegister FrameMap::nr2floatreg (int rnr) {
assert(_init_done, "tables not initialized");
- debug_only(fpu_range_check(rnr);)
+ DEBUG_ONLY(fpu_range_check(rnr);)
return _fpu_rnr2reg[rnr];
}
void FrameMap::map_float_register(int rnr, FloatRegister reg) {
- debug_only(fpu_range_check(rnr);)
- debug_only(fpu_range_check(reg->encoding());)
+ DEBUG_ONLY(fpu_range_check(rnr);)
+ DEBUG_ONLY(fpu_range_check(reg->encoding());)
_fpu_rnr2reg[rnr] = reg; // mapping c1 regnr. -> FloatRegister
_fpu_reg2rnr[reg->encoding()] = rnr; // mapping assembler encoding -> c1 regnr.
}
diff --git a/src/hotspot/cpu/s390/c1_FrameMap_s390.hpp b/src/hotspot/cpu/s390/c1_FrameMap_s390.hpp
index 66ccc8de876c4..721995f41fe0a 100644
--- a/src/hotspot/cpu/s390/c1_FrameMap_s390.hpp
+++ b/src/hotspot/cpu/s390/c1_FrameMap_s390.hpp
@@ -107,7 +107,7 @@
static int fpu_reg2rnr (FloatRegister reg) {
assert(_init_done, "tables not initialized");
int c1rnr = _fpu_reg2rnr[reg->encoding()];
- debug_only(fpu_range_check(c1rnr);)
+ DEBUG_ONLY(fpu_range_check(c1rnr);)
return c1rnr;
}
diff --git a/src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp b/src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp
index 5691a2055b3a2..0e873250dca3b 100644
--- a/src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp
+++ b/src/hotspot/cpu/s390/c1_MacroAssembler_s390.cpp
@@ -69,17 +69,18 @@ void C1_MacroAssembler::lock_object(Register Rmark, Register Roop, Register Rbox
// Save object being locked into the BasicObjectLock...
z_stg(Roop, Address(Rbox, BasicObjectLock::obj_offset()));
- if (DiagnoseSyncOnValueBasedClasses != 0) {
- load_klass(tmp, Roop);
- z_tm(Address(tmp, Klass::misc_flags_offset()), KlassFlags::_misc_is_value_based_class);
- branch_optimized(Assembler::bcondAllOne, slow_case);
- }
-
assert(LockingMode != LM_MONITOR, "LM_MONITOR is already handled, by emit_lock()");
if (LockingMode == LM_LIGHTWEIGHT) {
lightweight_lock(Rbox, Roop, Rmark, tmp, slow_case);
} else if (LockingMode == LM_LEGACY) {
+
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(tmp, Roop);
+ z_tm(Address(tmp, Klass::misc_flags_offset()), KlassFlags::_misc_is_value_based_class);
+ branch_optimized(Assembler::bcondAllOne, slow_case);
+ }
+
NearLabel done;
// Load object header.
diff --git a/src/hotspot/cpu/s390/c1_Runtime1_s390.cpp b/src/hotspot/cpu/s390/c1_Runtime1_s390.cpp
index b5d804d283e4f..db04703ceb02e 100644
--- a/src/hotspot/cpu/s390/c1_Runtime1_s390.cpp
+++ b/src/hotspot/cpu/s390/c1_Runtime1_s390.cpp
@@ -86,10 +86,10 @@ int StubAssembler::call_RT(Register oop_result1, Register metadata_result, addre
// Make sure that the vm_results are cleared.
if (oop_result1->is_valid()) {
- clear_mem(Address(Z_thread, JavaThread::vm_result_offset()), sizeof(jlong));
+ clear_mem(Address(Z_thread, JavaThread::vm_result_oop_offset()), sizeof(jlong));
}
if (metadata_result->is_valid()) {
- clear_mem(Address(Z_thread, JavaThread::vm_result_2_offset()), sizeof(jlong));
+ clear_mem(Address(Z_thread, JavaThread::vm_result_metadata_offset()), sizeof(jlong));
}
if (frame_size() == no_frame_size) {
// Pop the stub frame.
@@ -109,10 +109,10 @@ int StubAssembler::call_RT(Register oop_result1, Register metadata_result, addre
// Get oop results if there are any and reset the values in the thread.
if (oop_result1->is_valid()) {
- get_vm_result(oop_result1);
+ get_vm_result_oop(oop_result1);
}
if (metadata_result->is_valid()) {
- get_vm_result_2(metadata_result);
+ get_vm_result_metadata(metadata_result);
}
return call_offset;
@@ -886,8 +886,8 @@ OopMapSet* Runtime1::generate_handle_exception(C1StubId id, StubAssembler *sasm)
DEBUG_ONLY(__ z_lay(reg_fp, Address(Z_SP, frame_size_in_bytes));)
// Make sure that the vm_results are cleared (may be unnecessary).
- __ clear_mem(Address(Z_thread, JavaThread::vm_result_offset()), sizeof(oop));
- __ clear_mem(Address(Z_thread, JavaThread::vm_result_2_offset()), sizeof(Metadata*));
+ __ clear_mem(Address(Z_thread, JavaThread::vm_result_oop_offset()), sizeof(oop));
+ __ clear_mem(Address(Z_thread, JavaThread::vm_result_metadata_offset()), sizeof(Metadata*));
break;
}
case C1StubId::handle_exception_nofpu_id:
diff --git a/src/hotspot/cpu/s390/disassembler_s390.cpp b/src/hotspot/cpu/s390/disassembler_s390.cpp
index 98cff15f2ae78..a69851cfdba24 100644
--- a/src/hotspot/cpu/s390/disassembler_s390.cpp
+++ b/src/hotspot/cpu/s390/disassembler_s390.cpp
@@ -62,7 +62,7 @@ address Disassembler::decode_instruction0(address here, outputStream * st, addre
if (Assembler::is_z_nop((long)instruction_2bytes)) {
#if 1
- st->print("nop "); // fill up to operand column, leads to better code comment alignment
+ st->print("nop "); // fill up to operand column, leads to better code comment alignment
next = here + 2;
#else
// Compact disassembler output. Does not work the easy way.
@@ -76,7 +76,7 @@ address Disassembler::decode_instruction0(address here, outputStream * st, addre
instruction_2bytes = *(uint16_t*)(here+2*n_nops);
}
if (n_nops <= 4) { // do not group few subsequent nops
- st->print("nop "); // fill up to operand column, leads to better code comment alignment
+ st->print("nop "); // fill up to operand column, leads to better code comment alignment
next = here + 2;
} else {
st->print("nop count=%d", n_nops);
diff --git a/src/hotspot/cpu/s390/frame_s390.cpp b/src/hotspot/cpu/s390/frame_s390.cpp
index 01ed22c7d8620..b602d0adce579 100644
--- a/src/hotspot/cpu/s390/frame_s390.cpp
+++ b/src/hotspot/cpu/s390/frame_s390.cpp
@@ -185,7 +185,8 @@ bool frame::is_interpreted_frame() const {
void frame::interpreter_frame_set_locals(intptr_t* locs) {
assert(is_interpreted_frame(), "interpreted frame expected");
- ijava_state_unchecked()->locals = (uint64_t)locs;
+ // set relativized locals
+ *addr_at(_z_ijava_idx(locals)) = (intptr_t) (locs - fp());
}
// sender_sp
@@ -340,7 +341,7 @@ bool frame::is_interpreted_frame_valid(JavaThread* thread) const {
if (MetaspaceObj::is_valid(cp) == false) return false;
// validate locals
- address locals = (address)(ijava_state_unchecked()->locals);
+ address locals = (address)interpreter_frame_locals();
return thread->is_in_stack_range_incl(locals, (address)fp());
}
diff --git a/src/hotspot/cpu/s390/frame_s390.hpp b/src/hotspot/cpu/s390/frame_s390.hpp
index 3a6b3f33a5527..ab15e75bc5bf5 100644
--- a/src/hotspot/cpu/s390/frame_s390.hpp
+++ b/src/hotspot/cpu/s390/frame_s390.hpp
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2016, 2024, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2016, 2025, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2016, 2024 SAP SE. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
@@ -330,6 +330,10 @@
#define _z_ijava_state_neg(_component) \
(int) (-frame::z_ijava_state_size + offset_of(frame::z_ijava_state, _component))
+// Frame slot index relative to fp
+#define _z_ijava_idx(_component) \
+ (_z_ijava_state_neg(_component) >> LogBytesPerWord)
+
// ENTRY_FRAME
struct z_entry_frame_locals {
@@ -494,8 +498,6 @@
inline z_ijava_state* ijava_state() const;
- // Where z_ijava_state.monitors is saved.
- inline BasicObjectLock** interpreter_frame_monitors_addr() const;
// Where z_ijava_state.esp is saved.
inline intptr_t** interpreter_frame_esp_addr() const;
@@ -513,6 +515,8 @@
// Next two functions read and write z_ijava_state.monitors.
private:
inline BasicObjectLock* interpreter_frame_monitors() const;
+
+ // Where z_ijava_state.monitors is saved.
inline void interpreter_frame_set_monitors(BasicObjectLock* monitors);
public:
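The new _z_ijava_idx macro above converts the negative byte offset of an ijava_state member into a word index relative to fp, so frame slots can be addressed as fp[idx]. A minimal standalone sketch of that arithmetic (the concrete offset is hypothetical; LogBytesPerWord is 3 on 64-bit):

    #include <cstdio>

    int main() {
      const int LogBytesPerWord = 3;               // 64-bit: 8-byte words
      int byte_off = -72;                          // hypothetical _z_ijava_state_neg(locals)
      int slot_idx = byte_off >> LogBytesPerWord;  // arithmetic shift: -72 >> 3 == -9
      std::printf("byte offset %d -> fp-relative slot %d\n", byte_off, slot_idx);
      return 0;
    }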
diff --git a/src/hotspot/cpu/s390/frame_s390.inline.hpp b/src/hotspot/cpu/s390/frame_s390.inline.hpp
index d29106cfc40d6..59e23af7f487b 100644
--- a/src/hotspot/cpu/s390/frame_s390.inline.hpp
+++ b/src/hotspot/cpu/s390/frame_s390.inline.hpp
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2016, 2024, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2016, 2025, Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2016, 2024 SAP SE. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
@@ -109,16 +109,19 @@ inline frame::z_ijava_state* frame::ijava_state() const {
return state;
}
-inline BasicObjectLock** frame::interpreter_frame_monitors_addr() const {
- return (BasicObjectLock**) &(ijava_state()->monitors);
-}
-
// The next two functions read and write z_ijava_state.monitors.
inline BasicObjectLock* frame::interpreter_frame_monitors() const {
- return *interpreter_frame_monitors_addr();
+ BasicObjectLock* result = (BasicObjectLock*) at_relative(_z_ijava_idx(monitors));
+ // make sure the pointer points inside the frame
+ assert(sp() <= (intptr_t*) result, "monitor end should be above the stack pointer");
+ assert((intptr_t*) result < fp(), "monitor end should be strictly below the frame pointer: result: " INTPTR_FORMAT " fp: " INTPTR_FORMAT, p2i(result), p2i(fp()));
+ return result;
}
+
inline void frame::interpreter_frame_set_monitors(BasicObjectLock* monitors) {
- *interpreter_frame_monitors_addr() = monitors;
+ assert(is_interpreted_frame(), "interpreted frame expected");
+ // set relativized monitors
+ ijava_state()->monitors = (intptr_t) ((intptr_t*)monitors - fp());
}
// Accessors
@@ -180,7 +183,8 @@ inline intptr_t* frame::link_or_null() const {
}
inline intptr_t* frame::interpreter_frame_locals() const {
- return (intptr_t*) (ijava_state()->locals);
+ intptr_t n = *addr_at(_z_ijava_idx(locals));
+ return &fp()[n]; // return relativized locals
}
inline intptr_t* frame::interpreter_frame_bcp_addr() const {
@@ -202,11 +206,14 @@ inline intptr_t* frame::interpreter_frame_expression_stack() const {
// Also begin is one past last monitor.
inline intptr_t* frame::interpreter_frame_top_frame_sp() {
- return (intptr_t*)ijava_state()->top_frame_sp;
+ intptr_t n = *addr_at(_z_ijava_idx(top_frame_sp));
+ return &fp()[n]; // return derelativized top_frame_sp
}
inline void frame::interpreter_frame_set_top_frame_sp(intptr_t* top_frame_sp) {
- ijava_state()->top_frame_sp = (intptr_t) top_frame_sp;
+ assert(is_interpreted_frame(), "interpreted frame expected");
+ // set relativized top_frame_sp
+ ijava_state()->top_frame_sp = (intptr_t) (top_frame_sp - fp());
}
inline void frame::interpreter_frame_set_sender_sp(intptr_t* sender_sp) {
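The accessors above all follow one pattern: setters store the distance from fp in stack-element units, getters rebuild the absolute pointer as fp plus that index. A minimal standalone sketch of the round trip, using a hypothetical stand-in type rather than the real frame class:

    #include <cstdint>
    #include <cstdio>

    // Hypothetical stand-in, not the HotSpot frame class.
    struct Frame {
      intptr_t* fp_;
      intptr_t  slot_;   // one relativized ijava_state slot

      void set_relativized(intptr_t* p)    { slot_ = p - fp_; }     // word distance from fp
      intptr_t* get_derelativized() const  { return &fp_[slot_]; }  // absolute pointer again
    };

    int main() {
      intptr_t stack[64] = {};
      Frame f{&stack[32], 0};
      f.set_relativized(&stack[20]);       // e.g. monitors, below fp
      std::printf("%d\n", f.get_derelativized() == &stack[20]);  // prints 1
      return 0;
    }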
diff --git a/src/hotspot/cpu/s390/gc/g1/g1BarrierSetAssembler_s390.cpp b/src/hotspot/cpu/s390/gc/g1/g1BarrierSetAssembler_s390.cpp
index 2054c3db36c50..dea3317270e71 100644
--- a/src/hotspot/cpu/s390/gc/g1/g1BarrierSetAssembler_s390.cpp
+++ b/src/hotspot/cpu/s390/gc/g1/g1BarrierSetAssembler_s390.cpp
@@ -24,16 +24,16 @@
*/
#include "asm/macroAssembler.inline.hpp"
-#include "registerSaver_s390.hpp"
-#include "gc/g1/g1CardTable.hpp"
#include "gc/g1/g1BarrierSet.hpp"
#include "gc/g1/g1BarrierSetAssembler.hpp"
#include "gc/g1/g1BarrierSetRuntime.hpp"
+#include "gc/g1/g1CardTable.hpp"
#include "gc/g1/g1DirtyCardQueue.hpp"
#include "gc/g1/g1HeapRegion.hpp"
#include "gc/g1/g1SATBMarkQueueSet.hpp"
#include "gc/g1/g1ThreadLocalData.hpp"
#include "interpreter/interp_masm.hpp"
+#include "registerSaver_s390.hpp"
#include "runtime/jniHandles.hpp"
#include "runtime/sharedRuntime.hpp"
#include "utilities/macros.hpp"
diff --git a/src/hotspot/cpu/s390/gc/shared/barrierSetNMethod_s390.cpp b/src/hotspot/cpu/s390/gc/shared/barrierSetNMethod_s390.cpp
index 85dcc0a4e73f3..88b3199e4e166 100644
--- a/src/hotspot/cpu/s390/gc/shared/barrierSetNMethod_s390.cpp
+++ b/src/hotspot/cpu/s390/gc/shared/barrierSetNMethod_s390.cpp
@@ -40,7 +40,7 @@ class NativeMethodBarrier: public NativeInstruction {
address get_patchable_data_address() const {
address inst_addr = get_barrier_start_address() + PATCHABLE_INSTRUCTION_OFFSET;
- debug_only(Assembler::is_z_cfi(*((long*)inst_addr)));
+ DEBUG_ONLY(Assembler::is_z_cfi(*((long*)inst_addr)));
return inst_addr + 2;
}
@@ -91,7 +91,7 @@ static NativeMethodBarrier* get_nmethod_barrier(nmethod* nm) {
address barrier_address = nm->code_begin() + nm->frame_complete_offset() - NativeMethodBarrier::BARRIER_TOTAL_LENGTH;
auto barrier = reinterpret_cast<NativeMethodBarrier*>(barrier_address);
- debug_only(barrier->verify());
+ DEBUG_ONLY(barrier->verify());
return barrier;
}
diff --git a/src/hotspot/cpu/s390/interp_masm_s390.cpp b/src/hotspot/cpu/s390/interp_masm_s390.cpp
index 48f4c7293a291..aac130ea66e58 100644
--- a/src/hotspot/cpu/s390/interp_masm_s390.cpp
+++ b/src/hotspot/cpu/s390/interp_masm_s390.cpp
@@ -104,7 +104,15 @@ void InterpreterMacroAssembler::dispatch_base(TosState state, address* table, bo
}
{ Label OK;
// check if the locals pointer in Z_locals is correct
- z_cg(Z_locals, _z_ijava_state_neg(locals), Z_fp);
+
+ // _z_ijava_state_neg(locals) is fp relativized, so we need to
+ // extract the pointer.
+
+ z_lg(Z_R1_scratch, Address(Z_fp, _z_ijava_state_neg(locals)));
+ z_sllg(Z_R1_scratch, Z_R1_scratch, Interpreter::logStackElementSize);
+ z_agr(Z_R1_scratch, Z_fp);
+
+ z_cgr(Z_locals, Z_R1_scratch);
z_bre(OK);
reentry = stop_chain_static(reentry, "invalid locals pointer Z_locals: " FILE_AND_LINE);
bind(OK);
@@ -444,7 +452,7 @@ void InterpreterMacroAssembler::gen_subtype_check(Register Rsub_klass,
// Useful if consumed previously by access via stackTop().
void InterpreterMacroAssembler::popx(int len) {
add2reg(Z_esp, len*Interpreter::stackElementSize);
- debug_only(verify_esp(Z_esp, Z_R1_scratch));
+ DEBUG_ONLY(verify_esp(Z_esp, Z_R1_scratch));
}
// Get Address object of stack top. No checks. No pop.
@@ -458,38 +466,38 @@ void InterpreterMacroAssembler::pop_i(Register r) {
z_l(r, Interpreter::expr_offset_in_bytes(0), Z_esp);
add2reg(Z_esp, Interpreter::stackElementSize);
assert_different_registers(r, Z_R1_scratch);
- debug_only(verify_esp(Z_esp, Z_R1_scratch));
+ DEBUG_ONLY(verify_esp(Z_esp, Z_R1_scratch));
}
void InterpreterMacroAssembler::pop_ptr(Register r) {
z_lg(r, Interpreter::expr_offset_in_bytes(0), Z_esp);
add2reg(Z_esp, Interpreter::stackElementSize);
assert_different_registers(r, Z_R1_scratch);
- debug_only(verify_esp(Z_esp, Z_R1_scratch));
+ DEBUG_ONLY(verify_esp(Z_esp, Z_R1_scratch));
}
void InterpreterMacroAssembler::pop_l(Register r) {
z_lg(r, Interpreter::expr_offset_in_bytes(0), Z_esp);
add2reg(Z_esp, 2*Interpreter::stackElementSize);
assert_different_registers(r, Z_R1_scratch);
- debug_only(verify_esp(Z_esp, Z_R1_scratch));
+ DEBUG_ONLY(verify_esp(Z_esp, Z_R1_scratch));
}
void InterpreterMacroAssembler::pop_f(FloatRegister f) {
mem2freg_opt(f, Address(Z_esp, Interpreter::expr_offset_in_bytes(0)), false);
add2reg(Z_esp, Interpreter::stackElementSize);
- debug_only(verify_esp(Z_esp, Z_R1_scratch));
+ DEBUG_ONLY(verify_esp(Z_esp, Z_R1_scratch));
}
void InterpreterMacroAssembler::pop_d(FloatRegister f) {
mem2freg_opt(f, Address(Z_esp, Interpreter::expr_offset_in_bytes(0)), true);
add2reg(Z_esp, 2*Interpreter::stackElementSize);
- debug_only(verify_esp(Z_esp, Z_R1_scratch));
+ DEBUG_ONLY(verify_esp(Z_esp, Z_R1_scratch));
}
void InterpreterMacroAssembler::push_i(Register r) {
assert_different_registers(r, Z_R1_scratch);
- debug_only(verify_esp(Z_esp, Z_R1_scratch));
+ DEBUG_ONLY(verify_esp(Z_esp, Z_R1_scratch));
z_st(r, Address(Z_esp));
add2reg(Z_esp, -Interpreter::stackElementSize);
}
@@ -501,7 +509,7 @@ void InterpreterMacroAssembler::push_ptr(Register r) {
void InterpreterMacroAssembler::push_l(Register r) {
assert_different_registers(r, Z_R1_scratch);
- debug_only(verify_esp(Z_esp, Z_R1_scratch));
+ DEBUG_ONLY(verify_esp(Z_esp, Z_R1_scratch));
int offset = -Interpreter::stackElementSize;
z_stg(r, Address(Z_esp, offset));
clear_mem(Address(Z_esp), Interpreter::stackElementSize);
@@ -509,13 +517,13 @@ void InterpreterMacroAssembler::push_l(Register r) {
}
void InterpreterMacroAssembler::push_f(FloatRegister f) {
- debug_only(verify_esp(Z_esp, Z_R1_scratch));
+ DEBUG_ONLY(verify_esp(Z_esp, Z_R1_scratch));
freg2mem_opt(f, Address(Z_esp), false);
add2reg(Z_esp, -Interpreter::stackElementSize);
}
void InterpreterMacroAssembler::push_d(FloatRegister d) {
- debug_only(verify_esp(Z_esp, Z_R1_scratch));
+ DEBUG_ONLY(verify_esp(Z_esp, Z_R1_scratch));
int offset = -Interpreter::stackElementSize;
freg2mem_opt(d, Address(Z_esp, offset));
add2reg(Z_esp, 2 * offset);
@@ -568,7 +576,10 @@ void InterpreterMacroAssembler::prepare_to_jump_from_interpreted(Register method
// Satisfy interpreter calling convention (see generate_normal_entry()).
z_lgr(Z_R10, Z_SP); // Set sender sp (aka initial caller sp, aka unextended sp).
// Record top_frame_sp, because the callee might modify it, if it's compiled.
- z_stg(Z_SP, _z_ijava_state_neg(top_frame_sp), Z_fp);
+ assert_different_registers(Z_R1, method);
+ z_sgrk(Z_R1, Z_SP, Z_fp);
+ z_srag(Z_R1, Z_R1, Interpreter::logStackElementSize);
+ z_stg(Z_R1, _z_ijava_state_neg(top_frame_sp), Z_fp);
save_bcp();
save_esp();
z_lgr(Z_method, method); // Set Z_method (kills Z_fp!).
@@ -616,7 +627,7 @@ void InterpreterMacroAssembler::verify_esp(Register Resp, Register Rtemp) {
// i.e. IJAVA_STATE.monitors > Resp.
NearLabel OK;
Register Rmonitors = Rtemp;
- z_lg(Rmonitors, _z_ijava_state_neg(monitors), Z_fp);
+ get_monitors(Rmonitors);
compareU64_and_branch(Rmonitors, Resp, bcondHigh, OK);
reentry = stop_chain_static(reentry, "too many pops: Z_esp points into monitor area");
bind(OK);
@@ -665,10 +676,28 @@ void InterpreterMacroAssembler::restore_esp() {
void InterpreterMacroAssembler::get_monitors(Register reg) {
asm_assert_ijava_state_magic(reg);
+#ifdef ASSERT
+ NearLabel ok;
+ z_cg(Z_fp, 0, Z_SP);
+ z_bre(ok);
+ stop("Z_fp is corrupted");
+ bind(ok);
+#endif // ASSERT
mem2reg_opt(reg, Address(Z_fp, _z_ijava_state_neg(monitors)));
+ z_slag(reg, reg, Interpreter::logStackElementSize);
+ z_agr(reg, Z_fp);
}
void InterpreterMacroAssembler::save_monitors(Register reg) {
+#ifdef ASSERT
+ NearLabel ok;
+ z_cg(Z_fp, 0, Z_SP);
+ z_bre(ok);
+ stop("Z_fp is corrupted");
+ bind(ok);
+#endif // ASSERT
+ z_sgr(reg, Z_fp);
+ z_srag(reg, reg, Interpreter::logStackElementSize);
reg2mem_opt(reg, Address(Z_fp, _z_ijava_state_neg(monitors)));
}
@@ -684,6 +713,8 @@ void InterpreterMacroAssembler::save_mdp(Register mdp) {
void InterpreterMacroAssembler::restore_locals() {
asm_assert_ijava_state_magic(Z_locals);
z_lg(Z_locals, Address(Z_fp, _z_ijava_state_neg(locals)));
+ z_sllg(Z_locals, Z_locals, Interpreter::logStackElementSize);
+ z_agr(Z_locals, Z_fp);
}
void InterpreterMacroAssembler::get_method(Register reg) {
@@ -827,12 +858,11 @@ void InterpreterMacroAssembler::unlock_if_synchronized_method(TosState state,
// register for unlock_object to pass to VM directly.
Register R_current_monitor = Z_ARG2;
Register R_monitor_block_bot = Z_ARG1;
- const Address monitor_block_top(Z_fp, _z_ijava_state_neg(monitors));
const Address monitor_block_bot(Z_fp, -frame::z_ijava_state_size);
bind(restart);
// Starting with top-most entry.
- z_lg(R_current_monitor, monitor_block_top);
+ get_monitors(R_current_monitor);
// Points to word before bottom of monitor block.
load_address(R_monitor_block_bot, monitor_block_bot);
z_bru(entry);
@@ -1002,16 +1032,16 @@ void InterpreterMacroAssembler::lock_object(Register monitor, Register object) {
// markWord header = obj->mark().set_unlocked();
- if (DiagnoseSyncOnValueBasedClasses != 0) {
- load_klass(tmp, object);
- z_tm(Address(tmp, Klass::misc_flags_offset()), KlassFlags::_misc_is_value_based_class);
- z_btrue(slow_case);
- }
-
if (LockingMode == LM_LIGHTWEIGHT) {
lightweight_lock(monitor, object, header, tmp, slow_case);
} else if (LockingMode == LM_LEGACY) {
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(tmp, object);
+ z_tm(Address(tmp, Klass::misc_flags_offset()), KlassFlags::_misc_is_value_based_class);
+ z_btrue(slow_case);
+ }
+
// Load markWord from object into header.
z_lg(header, hdr_offset, object);
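At the assembler level the same conversion recurs throughout this file: a store relativizes with subtract-fp plus arithmetic right shift by Interpreter::logStackElementSize (z_sgr/z_sgrk then z_srag), and a load shifts left and adds Z_fp again (z_lg, z_sllg, z_agr). A standalone sketch of the equivalent arithmetic, with made-up addresses (values here are 8-byte aligned, so division and arithmetic shift agree):

    #include <cstdint>
    #include <cstdio>

    constexpr int log_stack_element_size = 3;   // 8-byte stack elements

    // store side: z_sgr/z_sgrk (subtract fp) then z_srag (shift right)
    intptr_t relativize(uintptr_t ptr, uintptr_t fp) {
      return (intptr_t)(ptr - fp) / (1 << log_stack_element_size);
    }

    // load side: z_lg (load slot), z_sllg (shift left), z_agr (add fp)
    uintptr_t derelativize(intptr_t slot, uintptr_t fp) {
      return fp + (uintptr_t)(slot * (1 << log_stack_element_size));
    }

    int main() {
      uintptr_t fp  = 0x7ffff000;
      uintptr_t esp = fp - 0x140;              // hypothetical monitors pointer
      intptr_t  rel = relativize(esp, fp);     // -40 stack elements
      std::printf("rel=%ld ok=%d\n", (long)rel, derelativize(rel, fp) == esp);
      return 0;
    }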
diff --git a/src/hotspot/cpu/s390/macroAssembler_s390.cpp b/src/hotspot/cpu/s390/macroAssembler_s390.cpp
index 54370a4959b64..0129e6049783c 100644
--- a/src/hotspot/cpu/s390/macroAssembler_s390.cpp
+++ b/src/hotspot/cpu/s390/macroAssembler_s390.cpp
@@ -2304,7 +2304,7 @@ void MacroAssembler::call_VM_base(Register oop_result,
// Get oop result if there is one and reset the value in the thread.
if (oop_result->is_valid()) {
- get_vm_result(oop_result);
+ get_vm_result_oop(oop_result);
}
_last_calls_return_pc = return_pc; // Wipe out other (error handling) calls.
@@ -4067,22 +4067,22 @@ void MacroAssembler::set_thread_state(JavaThreadState new_state) {
store_const(Address(Z_thread, JavaThread::thread_state_offset()), new_state, Z_R0, false);
}
-void MacroAssembler::get_vm_result(Register oop_result) {
- z_lg(oop_result, Address(Z_thread, JavaThread::vm_result_offset()));
- clear_mem(Address(Z_thread, JavaThread::vm_result_offset()), sizeof(void*));
+void MacroAssembler::get_vm_result_oop(Register oop_result) {
+ z_lg(oop_result, Address(Z_thread, JavaThread::vm_result_oop_offset()));
+ clear_mem(Address(Z_thread, JavaThread::vm_result_oop_offset()), sizeof(void*));
verify_oop(oop_result, FILE_AND_LINE);
}
-void MacroAssembler::get_vm_result_2(Register result) {
- z_lg(result, Address(Z_thread, JavaThread::vm_result_2_offset()));
- clear_mem(Address(Z_thread, JavaThread::vm_result_2_offset()), sizeof(void*));
+void MacroAssembler::get_vm_result_metadata(Register result) {
+ z_lg(result, Address(Z_thread, JavaThread::vm_result_metadata_offset()));
+ clear_mem(Address(Z_thread, JavaThread::vm_result_metadata_offset()), sizeof(void*));
}
// We require that C code which does not return a value in vm_result will
// leave it undisturbed.
void MacroAssembler::set_vm_result(Register oop_result) {
- z_stg(oop_result, Address(Z_thread, JavaThread::vm_result_offset()));
+ z_stg(oop_result, Address(Z_thread, JavaThread::vm_result_oop_offset()));
}
// Explicit null checks (used for method handle code).
@@ -6363,11 +6363,17 @@ void MacroAssembler::lightweight_lock(Register basic_lock, Register obj, Registe
z_lg(mark, Address(obj, mark_offset));
if (UseObjectMonitorTable) {
- // Clear cache in case fast locking succeeds.
+ // Clear cache in case fast locking succeeds or we need to take the slow-path.
const Address om_cache_addr = Address(basic_lock, BasicObjectLock::lock_offset() + in_ByteSize((BasicLock::object_monitor_cache_offset_in_bytes())));
z_mvghi(om_cache_addr, 0);
}
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(temp1, obj);
+ z_tm(Address(temp1, Klass::misc_flags_offset()), KlassFlags::_misc_is_value_based_class);
+ z_brne(slow);
+ }
+
// First we need to check if the lock-stack has room for pushing the object reference.
z_lgf(top, Address(Z_thread, ls_top_offset));
@@ -6501,7 +6507,7 @@ void MacroAssembler::compiler_fast_lock_lightweight_object(Register obj, Registe
NearLabel slow_path;
if (UseObjectMonitorTable) {
- // Clear cache in case fast locking succeeds.
+ // Clear cache in case fast locking succeeds or we need to take the slow-path.
z_mvghi(Address(box, BasicLock::object_monitor_cache_offset_in_bytes()), 0);
}
diff --git a/src/hotspot/cpu/s390/macroAssembler_s390.hpp b/src/hotspot/cpu/s390/macroAssembler_s390.hpp
index 3f7744588d6ec..4aa7bb56f3ad4 100644
--- a/src/hotspot/cpu/s390/macroAssembler_s390.hpp
+++ b/src/hotspot/cpu/s390/macroAssembler_s390.hpp
@@ -816,8 +816,8 @@ class MacroAssembler: public Assembler {
void set_thread_state(JavaThreadState new_state);
// Read vm result from thread.
- void get_vm_result (Register oop_result);
- void get_vm_result_2(Register result);
+ void get_vm_result_oop (Register oop_result);
+ void get_vm_result_metadata(Register result);
// Vm result is currently getting hijacked to for oop preservation.
void set_vm_result(Register oop_result);
diff --git a/src/hotspot/cpu/s390/runtime_s390.cpp b/src/hotspot/cpu/s390/runtime_s390.cpp
index 4eedb3877d2ab..8f96ff55ccb4c 100644
--- a/src/hotspot/cpu/s390/runtime_s390.cpp
+++ b/src/hotspot/cpu/s390/runtime_s390.cpp
@@ -72,6 +72,9 @@ ExceptionBlob* OptoRuntime::generate_exception_blob() {
// Setup code generation tools
const char* name = OptoRuntime::stub_name(OptoStubId::exception_id);
CodeBuffer buffer(name, 2048, 1024);
+ if (buffer.blob() == nullptr) {
+ return nullptr;
+ }
MacroAssembler* masm = new MacroAssembler(&buffer);
Register handle_exception = Z_ARG5;
diff --git a/src/hotspot/cpu/s390/s390.ad b/src/hotspot/cpu/s390/s390.ad
index fc7de7e70e909..c32064be86d87 100644
--- a/src/hotspot/cpu/s390/s390.ad
+++ b/src/hotspot/cpu/s390/s390.ad
@@ -6581,7 +6581,7 @@ instruct mulHiL_reg_reg(revenRegL Rdst, roddRegL Rsrc1, iRegL Rsrc2, iRegL Rtmp1
Register tmp1 = $Rtmp1$$Register;
Register tmp2 = $Rdst$$Register;
// z/Architecture has only unsigned multiply (64 * 64 -> 128).
- // implementing mulhs(a,b) = mulhu(a,b) – (a & (b>>63)) – (b & (a>>63))
+ // implementing mulhs(a,b) = mulhu(a,b) - (a & (b>>63)) - (b & (a>>63))
__ z_srag(tmp2, src1, 63); // a>>63
__ z_srag(tmp1, src2, 63); // b>>63
__ z_ngr(tmp2, src2); // b & (a>>63)
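The comment above relies on the identity mulhs(a,b) = mulhu(a,b) - (a & (b>>63)) - (b & (a>>63)), where mulhs/mulhu are the signed/unsigned high 64 bits of the 64x64-bit product and >> is an arithmetic shift, so (x>>63) is an all-ones mask exactly when x is negative. A standalone check of that identity (uses the GCC/Clang __int128 extension; the operand values are arbitrary):

    #include <cstdint>
    #include <cstdio>

    static uint64_t mulhu(uint64_t a, uint64_t b) {
      return (uint64_t)(((unsigned __int128)a * b) >> 64);
    }
    static int64_t mulhs(int64_t a, int64_t b) {
      return (int64_t)(((__int128)a * b) >> 64);
    }

    int main() {
      int64_t a = -0x123456789abcdef0, b = 0x0fedcba987654321;
      uint64_t via_unsigned = mulhu((uint64_t)a, (uint64_t)b)
                            - ((uint64_t)a & (uint64_t)(b >> 63))
                            - ((uint64_t)b & (uint64_t)(a >> 63));
      std::printf("%d\n", (int64_t)via_unsigned == mulhs(a, b));  // prints 1
      return 0;
    }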
@@ -10080,7 +10080,8 @@ instruct ShouldNotReachHere() %{
format %{ "ILLTRAP; ShouldNotReachHere" %}
ins_encode %{
if (is_reachable()) {
- __ stop(_halt_reason);
+ const char* str = __ code_string(_halt_reason);
+ __ stop(str);
}
%}
ins_pipe(pipe_class_dummy);
diff --git a/src/hotspot/cpu/s390/sharedRuntime_s390.cpp b/src/hotspot/cpu/s390/sharedRuntime_s390.cpp
index 7b6735eabccc6..bd5bbf4c7e5e6 100644
--- a/src/hotspot/cpu/s390/sharedRuntime_s390.cpp
+++ b/src/hotspot/cpu/s390/sharedRuntime_s390.cpp
@@ -2352,12 +2352,12 @@ void SharedRuntime::gen_i2c_adapter(MacroAssembler *masm,
__ z_br(Z_R1_scratch);
}
-AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm,
- int total_args_passed,
- int comp_args_on_stack,
- const BasicType *sig_bt,
- const VMRegPair *regs,
- AdapterFingerPrint* fingerprint) {
+void SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm,
+ int total_args_passed,
+ int comp_args_on_stack,
+ const BasicType *sig_bt,
+ const VMRegPair *regs,
+ AdapterHandlerEntry* handler) {
__ align(CodeEntryAlignment);
address i2c_entry = __ pc();
gen_i2c_adapter(masm, total_args_passed, comp_args_on_stack, sig_bt, regs);
@@ -2411,7 +2411,8 @@ AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm
gen_c2i_adapter(masm, total_args_passed, comp_args_on_stack, sig_bt, regs, skip_fixup);
- return AdapterHandlerLibrary::new_entry(fingerprint, i2c_entry, c2i_entry, c2i_unverified_entry, c2i_no_clinit_check_entry);
+ handler->set_entry_points(i2c_entry, c2i_entry, c2i_unverified_entry, c2i_no_clinit_check_entry);
+ return;
}
// This function returns the adjust size (in number of words) to a c2i adapter
@@ -2768,6 +2769,9 @@ UncommonTrapBlob* OptoRuntime::generate_uncommon_trap_blob() {
// Setup code generation tools
const char* name = OptoRuntime::stub_name(OptoStubId::uncommon_trap_id);
CodeBuffer buffer(name, 2048, 1024);
+ if (buffer.blob() == nullptr) {
+ return nullptr;
+ }
InterpreterMacroAssembler* masm = new InterpreterMacroAssembler(&buffer);
Register unroll_block_reg = Z_tmp_1;
@@ -3043,7 +3047,7 @@ RuntimeStub* SharedRuntime::generate_resolve_blob(SharedStubId id, address desti
RegisterSaver::restore_live_registers(masm, RegisterSaver::all_registers);
// get the returned method
- __ get_vm_result_2(Z_method);
+ __ get_vm_result_metadata(Z_method);
// We are back to the original state on entry and ready to go.
__ z_br(Z_R1_scratch);
@@ -3057,7 +3061,7 @@ RuntimeStub* SharedRuntime::generate_resolve_blob(SharedStubId id, address desti
// exception pending => remove activation and forward to exception handler
__ z_lgr(Z_R2, Z_R0); // pending_exception
- __ clear_mem(Address(Z_thread, JavaThread::vm_result_offset()), sizeof(jlong));
+ __ clear_mem(Address(Z_thread, JavaThread::vm_result_oop_offset()), sizeof(jlong));
__ load_const_optimized(Z_R1_scratch, StubRoutines::forward_exception_entry());
__ z_br(Z_R1_scratch);
diff --git a/src/hotspot/cpu/s390/templateInterpreterGenerator_s390.cpp b/src/hotspot/cpu/s390/templateInterpreterGenerator_s390.cpp
index 841f4f9ca4bd2..c3c99d7297d11 100644
--- a/src/hotspot/cpu/s390/templateInterpreterGenerator_s390.cpp
+++ b/src/hotspot/cpu/s390/templateInterpreterGenerator_s390.cpp
@@ -637,6 +637,8 @@ address TemplateInterpreterGenerator::generate_return_entry_for (TosState state,
Register sp_before_i2c_extension = Z_bcp;
__ z_lg(Z_fp, _z_abi(callers_sp), Z_SP); // Restore frame pointer.
__ z_lg(sp_before_i2c_extension, Address(Z_fp, _z_ijava_state_neg(top_frame_sp)));
+ __ z_slag(sp_before_i2c_extension, sp_before_i2c_extension, Interpreter::logStackElementSize);
+ __ z_agr(sp_before_i2c_extension, Z_fp);
__ resize_frame_absolute(sp_before_i2c_extension, Z_locals/*tmp*/, true/*load_fp*/);
// TODO(ZASM): necessary??
@@ -1134,7 +1136,11 @@ void TemplateInterpreterGenerator::generate_fixed_frame(bool native_call) {
__ z_agr(Z_locals, Z_esp);
// z_ijava_state->locals - i*BytesPerWord points to i-th Java local (i starts at 0)
// z_ijava_state->locals = Z_esp + parameter_count bytes
- __ z_stg(Z_locals, _z_ijava_state_neg(locals), fp);
+
+ __ z_sgrk(Z_R0, Z_locals, fp); // Z_R0 = Z_locals - fp();
+ __ z_srlg(Z_R0, Z_R0, Interpreter::logStackElementSize);
+ // Store relativized Z_locals, see frame::interpreter_frame_locals().
+ __ z_stg(Z_R0, _z_ijava_state_neg(locals), fp);
// z_ijava_state->oop_temp = nullptr;
__ store_const(Address(fp, oop_tmp_offset), 0);
@@ -1168,7 +1174,11 @@ void TemplateInterpreterGenerator::generate_fixed_frame(bool native_call) {
// z_ijava_state->monitors = fp - frame::z_ijava_state_size - Interpreter::stackElementSize;
// z_ijava_state->esp = Z_esp = z_ijava_state->monitors;
__ add2reg(Z_esp, -frame::z_ijava_state_size, fp);
- __ z_stg(Z_esp, _z_ijava_state_neg(monitors), fp);
+
+ __ z_sgrk(Z_R0, Z_esp, fp);
+ __ z_srag(Z_R0, Z_R0, Interpreter::logStackElementSize);
+ __ z_stg(Z_R0, _z_ijava_state_neg(monitors), fp);
+
__ add2reg(Z_esp, -Interpreter::stackElementSize);
__ z_stg(Z_esp, _z_ijava_state_neg(esp), fp);
@@ -1627,7 +1637,7 @@ address TemplateInterpreterGenerator::generate_native_entry(bool synchronized) {
__ add2reg(Rfirst_monitor, -(frame::z_ijava_state_size + (int)sizeof(BasicObjectLock)), Z_fp);
#ifdef ASSERT
NearLabel ok;
- __ z_lg(Z_R1, _z_ijava_state_neg(monitors), Z_fp);
+ __ get_monitors(Z_R1);
__ compareU64_and_branch(Rfirst_monitor, Z_R1, Assembler::bcondEqual, ok);
reentry = __ stop_chain_static(reentry, "native_entry:unlock: inconsistent z_ijava_state.monitors");
__ bind(ok);
@@ -2228,7 +2238,7 @@ void TemplateInterpreterGenerator::generate_throw_exception() {
__ remove_activation(vtos, noreg/*ret.pc already loaded*/, false/*throw exc*/, true/*install exc*/, false/*notify jvmti*/);
__ z_lg(Z_fp, _z_abi(callers_sp), Z_SP); // Restore frame pointer.
- __ get_vm_result(Z_ARG1); // Restore exception.
+ __ get_vm_result_oop(Z_ARG1); // Restore exception.
__ verify_oop(Z_ARG1);
__ z_lgr(Z_ARG2, return_pc); // Restore return address.
diff --git a/src/hotspot/cpu/s390/templateTable_s390.cpp b/src/hotspot/cpu/s390/templateTable_s390.cpp
index e6c0c7781a3ba..2b39cc8318cbf 100644
--- a/src/hotspot/cpu/s390/templateTable_s390.cpp
+++ b/src/hotspot/cpu/s390/templateTable_s390.cpp
@@ -65,7 +65,8 @@
// The actual size of each block heavily depends on the CPU capabilities and,
// of course, on the logic implemented in each block.
#ifdef ASSERT
- #define BTB_MINSIZE 256
+// With the asserts introduced in get_monitors() & save_monitors(), the required block size is now 322.
+ #define BTB_MINSIZE 512
#else
#define BTB_MINSIZE 64
#endif
@@ -91,7 +92,8 @@
if (len > alignment) { \
tty->print_cr("%4d of %4d @ " INTPTR_FORMAT ": Block len for %s", \
len, alignment, e_addr-len, name); \
- guarantee(len <= alignment, "block too large"); \
+ guarantee(len <= alignment, "block too large, len = %d, alignment = %d", \
+ len, alignment); \
} \
guarantee(len == e_addr-b_addr, "block len mismatch"); \
}
@@ -112,7 +114,8 @@
if (len > alignment) { \
tty->print_cr("%4d of %4d @ " INTPTR_FORMAT ": Block len for %s", \
len, alignment, e_addr-len, name); \
- guarantee(len <= alignment, "block too large"); \
+ guarantee(len <= alignment, "block too large, len = %d, alignment = %d", \
+ len, alignment); \
} \
guarantee(len == e_addr-b_addr, "block len mismatch"); \
}
@@ -540,7 +543,7 @@ void TemplateTable::condy_helper(Label& Done) {
const Register rarg = Z_ARG2;
__ load_const_optimized(rarg, (int)bytecode());
call_VM(obj, CAST_FROM_FN_PTR(address, InterpreterRuntime::resolve_ldc), rarg);
- __ get_vm_result_2(flags);
+ __ get_vm_result_metadata(flags);
// VMr = obj = base address to find primitive value to push
// VMr2 = flags = (tos, off) using format of CPCE::_flags
@@ -4063,7 +4066,7 @@ void TemplateTable::checkcast() {
__ push(atos); // Save receiver for result, and for GC.
call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
- __ get_vm_result_2(Z_tos);
+ __ get_vm_result_metadata(Z_tos);
Register receiver = Z_ARG4;
Register klass = Z_tos;
@@ -4135,7 +4138,7 @@ void TemplateTable::instanceof() {
__ push(atos); // Save receiver for result, and for GC.
call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
- __ get_vm_result_2(Z_tos);
+ __ get_vm_result_metadata(Z_tos);
Register receiver = Z_tmp_2;
Register klass = Z_tos;
diff --git a/src/hotspot/cpu/s390/vm_version_s390.cpp b/src/hotspot/cpu/s390/vm_version_s390.cpp
index 157b945e6e1a4..8261fbd083aae 100644
--- a/src/hotspot/cpu/s390/vm_version_s390.cpp
+++ b/src/hotspot/cpu/s390/vm_version_s390.cpp
@@ -90,7 +90,7 @@ static const char* z_features[] = {" ",
void VM_Version::initialize() {
determine_features(); // Get processor capabilities.
- set_features_string(); // Set a descriptive feature indication.
+ set_cpu_info_string(); // Set a descriptive feature indication.
if (Verbose || PrintAssembly || PrintStubCode) {
print_features_internal("CPU Version as detected internally:", PrintAssembly || PrintStubCode);
@@ -388,9 +388,9 @@ int VM_Version::get_model_index() {
}
-void VM_Version::set_features_string() {
- // A note on the _features_string format:
- // There are jtreg tests checking the _features_string for various properties.
+void VM_Version::set_cpu_info_string() {
+ // A note on the _cpu_info_string format:
+ // There are jtreg tests checking the _cpu_info_string for various properties.
// For some strange reason, these tests require the string to contain
// only _lowercase_ characters. Keep that in mind when being surprised
// about the unusual notation of features - and when adding new ones.
@@ -412,29 +412,29 @@ void VM_Version::set_features_string() {
_model_string = "unknown model";
strcpy(buf, "z/Architecture (ambiguous detection)");
}
- _features_string = os::strdup(buf);
+ _cpu_info_string = os::strdup(buf);
if (has_Crypto_AES()) {
- assert(strlen(_features_string) + 3*8 < sizeof(buf), "increase buffer size");
+ assert(strlen(_cpu_info_string) + 3*8 < sizeof(buf), "increase buffer size");
jio_snprintf(buf, sizeof(buf), "%s%s%s%s",
- _features_string,
+ _cpu_info_string,
has_Crypto_AES128() ? ", aes128" : "",
has_Crypto_AES192() ? ", aes192" : "",
has_Crypto_AES256() ? ", aes256" : "");
- os::free((void *)_features_string);
- _features_string = os::strdup(buf);
+ os::free((void *)_cpu_info_string);
+ _cpu_info_string = os::strdup(buf);
}
if (has_Crypto_SHA()) {
- assert(strlen(_features_string) + 6 + 2*8 + 7 < sizeof(buf), "increase buffer size");
+ assert(strlen(_cpu_info_string) + 6 + 2*8 + 7 < sizeof(buf), "increase buffer size");
jio_snprintf(buf, sizeof(buf), "%s%s%s%s%s",
- _features_string,
+ _cpu_info_string,
has_Crypto_SHA1() ? ", sha1" : "",
has_Crypto_SHA256() ? ", sha256" : "",
has_Crypto_SHA512() ? ", sha512" : "",
has_Crypto_GHASH() ? ", ghash" : "");
- os::free((void *)_features_string);
- _features_string = os::strdup(buf);
+ os::free((void *)_cpu_info_string);
+ _cpu_info_string = os::strdup(buf);
}
}
@@ -464,7 +464,7 @@ bool VM_Version::test_feature_bit(unsigned long* featureBuffer, int featureNum,
}
void VM_Version::print_features_internal(const char* text, bool print_anyway) {
- tty->print_cr("%s %s", text, features_string());
+ tty->print_cr("%s %s", text, cpu_info_string());
tty->cr();
if (Verbose || print_anyway) {
@@ -906,7 +906,7 @@ void VM_Version::set_features_from(const char* march) {
err = true;
}
if (!err) {
- set_features_string();
+ set_cpu_info_string();
if (prt || PrintAssembly) {
print_features_internal("CPU Version as set by cmdline option:", prt);
}
@@ -1542,6 +1542,6 @@ void VM_Version::initialize_cpu_information(void) {
_no_of_threads = _no_of_cores;
_no_of_sockets = _no_of_cores;
snprintf(_cpu_name, CPU_TYPE_DESC_BUF_SIZE, "s390 %s", VM_Version::get_model_string());
- snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "s390 %s", features_string());
+ snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "s390 %s", cpu_info_string());
_initialized = true;
}
diff --git a/src/hotspot/cpu/s390/vm_version_s390.hpp b/src/hotspot/cpu/s390/vm_version_s390.hpp
index 49e6f5686f60a..6c6eb76bf7b03 100644
--- a/src/hotspot/cpu/s390/vm_version_s390.hpp
+++ b/src/hotspot/cpu/s390/vm_version_s390.hpp
@@ -148,7 +148,7 @@ class VM_Version: public Abstract_VM_Version {
static bool test_feature_bit(unsigned long* featureBuffer, int featureNum, unsigned int bufLen);
static int get_model_index();
- static void set_features_string();
+ static void set_cpu_info_string();
static void print_features_internal(const char* text, bool print_anyway=false);
static void determine_features();
static long call_getFeatures(unsigned long* buffer, int buflen, int functionCode);
diff --git a/src/hotspot/cpu/x86/assembler_x86.cpp b/src/hotspot/cpu/x86/assembler_x86.cpp
index 29e4fcee2f63a..7a4d7c6d6f340 100644
--- a/src/hotspot/cpu/x86/assembler_x86.cpp
+++ b/src/hotspot/cpu/x86/assembler_x86.cpp
@@ -24,6 +24,8 @@
#include "asm/assembler.hpp"
#include "asm/assembler.inline.hpp"
+#include "asm/codeBuffer.hpp"
+#include "code/codeCache.hpp"
#include "gc/shared/cardTableBarrierSet.hpp"
#include "interpreter/interpreter.hpp"
#include "memory/resourceArea.hpp"
@@ -119,8 +121,6 @@ AddressLiteral::AddressLiteral(address target, relocInfo::relocType rtype) {
// Implementation of Address
-#ifdef _LP64
-
Address Address::make_array(ArrayAddress adr) {
// Not implementable on 64bit machines
// Should have been handled higher up the call chain.
@@ -157,30 +157,6 @@ Address::Address(int disp, address loc, relocInfo::relocType rtype) {
ShouldNotReachHere();
}
}
-#else // LP64
-
-Address Address::make_array(ArrayAddress adr) {
- AddressLiteral base = adr.base();
- Address index = adr.index();
- assert(index._disp == 0, "must not have disp"); // maybe it can?
- Address array(index._base, index._index, index._scale, (intptr_t) base.target());
- array._rspec = base._rspec;
- return array;
-}
-
-// exceedingly dangerous constructor
-Address::Address(address loc, RelocationHolder spec) {
- _base = noreg;
- _index = noreg;
- _scale = no_scale;
- _disp = (intptr_t) loc;
- _rspec = spec;
- _xmmindex = xnoreg;
- _isxmmindex = false;
-}
-
-#endif // _LP64
-
// Convert the raw encoding form into the form expected by the constructor for
@@ -214,7 +190,6 @@ void Assembler::init_attributes(void) {
_legacy_mode_dq = (VM_Version::supports_avx512dq() == false);
_legacy_mode_vl = (VM_Version::supports_avx512vl() == false);
_legacy_mode_vlbw = (VM_Version::supports_avx512vlbw() == false);
- NOT_LP64(_is_managed = false;)
_attributes = nullptr;
}
@@ -744,8 +719,8 @@ void Assembler::emit_operand_helper(int reg_enc, int base_enc, int index_enc,
assert(inst_mark() != nullptr, "must be inside InstructionMark");
address next_ip = pc() + sizeof(int32_t) + post_addr_length;
int64_t adjusted = disp;
- // Do rip-rel adjustment for 64bit
- LP64_ONLY(adjusted -= (next_ip - inst_mark()));
+ // Do rip-rel adjustment
+ adjusted -= (next_ip - inst_mark());
assert(is_simm32(adjusted),
"must be 32bit offset (RIP relative address)");
emit_data((int32_t) adjusted, rspec, disp32_operand);
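In the rip-relative branch above, the emitted 32-bit field must be relative to the address of the next instruction, so the displacement (measured from inst_mark(), which is what the subtraction implies) is reduced by (next_ip - inst_mark()) before being emitted. A toy illustration of that arithmetic, with made-up addresses:

    #include <cstdint>
    #include <cstdio>

    int main() {
      uint64_t inst_mark = 0x1000;   // start of the instruction
      uint64_t next_ip   = 0x1007;   // end of the instruction (after disp32 and any tail)
      uint64_t target    = 0x2000;   // address the operand should reach
      int64_t  disp      = (int64_t)(target - inst_mark);            // 0x1000
      int64_t  adjusted  = disp - (int64_t)(next_ip - inst_mark);    // 0xff9
      std::printf("encoded rel32 = 0x%llx, target - next_ip = 0x%llx\n",
                  (unsigned long long)adjusted, (unsigned long long)(target - next_ip));
      return 0;
    }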
@@ -826,7 +801,7 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
address ip = inst;
bool is_64bit = false;
- debug_only(bool has_disp32 = false);
+ DEBUG_ONLY(bool has_disp32 = false);
int tail_size = 0; // other random bytes (#32, #16, etc.) at end of insn
again_after_prefix:
@@ -846,7 +821,7 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
case FS_segment:
case GS_segment:
// Seems dubious
- LP64_ONLY(assert(false, "shouldn't have that prefix"));
+ assert(false, "shouldn't have that prefix");
assert(ip == inst+1, "only one prefix allowed");
goto again_after_prefix;
@@ -859,11 +834,9 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
case REX_RB:
case REX_RX:
case REX_RXB:
- NOT_LP64(assert(false, "64bit prefixes"));
goto again_after_prefix;
case REX2:
- NOT_LP64(assert(false, "64bit prefixes"));
if ((0xFF & *ip++) & REX2BIT_W) {
is_64bit = true;
}
@@ -877,7 +850,6 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
case REX_WRB:
case REX_WRX:
case REX_WRXB:
- NOT_LP64(assert(false, "64bit prefixes"));
is_64bit = true;
goto again_after_prefix;
@@ -887,7 +859,7 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
case 0x8A: // movb r, a
case 0x8B: // movl r, a
case 0x8F: // popl a
- debug_only(has_disp32 = true);
+ DEBUG_ONLY(has_disp32 = true);
break;
case 0x68: // pushq #32
@@ -916,11 +888,9 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
case REX_WRB:
case REX_WRX:
case REX_WRXB:
- NOT_LP64(assert(false, "64bit prefix found"));
goto again_after_size_prefix2;
case REX2:
- NOT_LP64(assert(false, "64bit prefix found"));
if ((0xFF & *ip++) & REX2BIT_W) {
is_64bit = true;
}
@@ -928,10 +898,10 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
case 0x8B: // movw r, a
case 0x89: // movw a, r
- debug_only(has_disp32 = true);
+ DEBUG_ONLY(has_disp32 = true);
break;
case 0xC7: // movw a, #16
- debug_only(has_disp32 = true);
+ DEBUG_ONLY(has_disp32 = true);
tail_size = 2; // the imm16
break;
case 0x0F: // several SSE/SSE2 variants
@@ -945,20 +915,15 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
case REP8(0xB8): // movl/q r, #32/#64(oop?)
if (which == end_pc_operand) return ip + (is_64bit ? 8 : 4);
// these asserts are somewhat nonsensical
-#ifndef _LP64
- assert(which == imm_operand || which == disp32_operand,
- "which %d is_64_bit %d ip " INTPTR_FORMAT, which, is_64bit, p2i(ip));
-#else
assert(((which == call32_operand || which == imm_operand) && is_64bit) ||
(which == narrow_oop_operand && !is_64bit),
"which %d is_64_bit %d ip " INTPTR_FORMAT, which, is_64bit, p2i(ip));
-#endif // _LP64
return ip;
case 0x69: // imul r, a, #32
case 0xC7: // movl a, #32(oop?)
tail_size = 4;
- debug_only(has_disp32 = true); // has both kinds of operands!
+ DEBUG_ONLY(has_disp32 = true); // has both kinds of operands!
break;
case 0x0F: // movx..., etc.
@@ -967,11 +932,11 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
tail_size = 1;
case 0x38: // ptest, pmovzxbw
ip++; // skip opcode
- debug_only(has_disp32 = true); // has both kinds of operands!
+ DEBUG_ONLY(has_disp32 = true); // has both kinds of operands!
break;
case 0x70: // pshufd r, r/a, #8
- debug_only(has_disp32 = true); // has both kinds of operands!
+ DEBUG_ONLY(has_disp32 = true); // has both kinds of operands!
case 0x73: // psrldq r, #8
tail_size = 1;
break;
@@ -996,7 +961,7 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
case 0xAE: // ldmxcsr, stmxcsr, fxrstor, fxsave, clflush
case 0xD6: // movq
case 0xFE: // paddd
- debug_only(has_disp32 = true);
+ DEBUG_ONLY(has_disp32 = true);
break;
case 0xAD: // shrd r, a, %cl
@@ -1011,18 +976,18 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
case 0xC1: // xaddl
case 0xC7: // cmpxchg8
case REP16(0x90): // setcc a
- debug_only(has_disp32 = true);
+ DEBUG_ONLY(has_disp32 = true);
// fall out of the switch to decode the address
break;
case 0xC4: // pinsrw r, a, #8
- debug_only(has_disp32 = true);
+ DEBUG_ONLY(has_disp32 = true);
case 0xC5: // pextrw r, r, #8
tail_size = 1; // the imm8
break;
case 0xAC: // shrd r, a, #8
- debug_only(has_disp32 = true);
+ DEBUG_ONLY(has_disp32 = true);
tail_size = 1; // the imm8
break;
@@ -1039,12 +1004,12 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
// also: orl, adcl, sbbl, andl, subl, xorl, cmpl
// on 32bit in the case of cmpl, the imm might be an oop
tail_size = 4;
- debug_only(has_disp32 = true); // has both kinds of operands!
+ DEBUG_ONLY(has_disp32 = true); // has both kinds of operands!
break;
case 0x83: // addl a, #8; addl r, #8
// also: orl, adcl, sbbl, andl, subl, xorl, cmpl
- debug_only(has_disp32 = true); // has both kinds of operands!
+ DEBUG_ONLY(has_disp32 = true); // has both kinds of operands!
tail_size = 1;
break;
@@ -1061,7 +1026,7 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
case 0x9B:
switch (0xFF & *ip++) {
case 0xD9: // fnstcw a
- debug_only(has_disp32 = true);
+ DEBUG_ONLY(has_disp32 = true);
break;
default:
ShouldNotReachHere();
@@ -1080,7 +1045,7 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
case 0x87: // xchg r, a
case REP4(0x38): // cmp...
case 0x85: // test r, a
- debug_only(has_disp32 = true); // has both kinds of operands!
+ DEBUG_ONLY(has_disp32 = true); // has both kinds of operands!
break;
case 0xA8: // testb rax, #8
@@ -1092,7 +1057,7 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
case 0xC6: // movb a, #8
case 0x80: // cmpb a, #8
case 0x6B: // imul r, a, #8
- debug_only(has_disp32 = true); // has both kinds of operands!
+ DEBUG_ONLY(has_disp32 = true); // has both kinds of operands!
tail_size = 1; // the imm8
break;
@@ -1113,8 +1078,6 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
// to check for them in product version.
// Check second byte
- NOT_LP64(assert((0xC0 & *ip) == 0xC0, "shouldn't have LDS and LES instructions"));
-
int vex_opcode;
// First byte
if ((0xFF & *inst) == VEX_3bytes) {
@@ -1146,7 +1109,7 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
break;
}
ip++; // skip opcode
- debug_only(has_disp32 = true); // has both kinds of operands!
+ DEBUG_ONLY(has_disp32 = true); // has both kinds of operands!
break;
case 0x62: // EVEX_4bytes
@@ -1172,7 +1135,7 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
break;
}
ip++; // skip opcode
- debug_only(has_disp32 = true); // has both kinds of operands!
+ DEBUG_ONLY(has_disp32 = true); // has both kinds of operands!
break;
case 0xD1: // sal a, 1; sar a, 1; shl a, 1; shr a, 1
@@ -1184,7 +1147,7 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
case 0xD8: // fadd_s a; fsubr_s a; fmul_s a; fdivr_s a; fcomp_s a
case 0xDC: // fadd_d a; fsubr_d a; fmul_d a; fdivr_d a; fcomp_d a
case 0xDE: // faddp_d a; fsubrp_d a; fmulp_d a; fdivrp_d a; fcompp_d a
- debug_only(has_disp32 = true);
+ DEBUG_ONLY(has_disp32 = true);
break;
case 0xE8: // call rdisp32
@@ -1216,12 +1179,12 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
case REX_WRX:
case REX_WRXB:
case REX2:
- NOT_LP64(assert(false, "found 64bit prefix"));
ip++;
+ // fall-through
default:
ip++;
}
- debug_only(has_disp32 = true); // has both kinds of operands!
+ DEBUG_ONLY(has_disp32 = true); // has both kinds of operands!
break;
default:
@@ -1232,12 +1195,7 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
}
assert(which != call32_operand, "instruction is not a call, jmp, or jcc");
-#ifdef _LP64
assert(which != imm_operand, "instruction is not a movq reg, imm64");
-#else
- // assert(which != imm_operand || has_imm32, "instruction has no imm32 field");
- assert(which != imm_operand || has_disp32, "instruction has no imm32 field");
-#endif // LP64
assert(which != disp32_operand || has_disp32, "instruction has no disp32 field");
// parse the output of emit_operand
@@ -1292,11 +1250,7 @@ address Assembler::locate_operand(address inst, WhichOperand which) {
return ip + tail_size;
}
-#ifdef _LP64
assert(which == narrow_oop_operand && !is_64bit, "instruction is not a movl adr, imm32");
-#else
- assert(which == imm_operand, "instruction has only an imm field");
-#endif // LP64
return ip;
}
@@ -1319,8 +1273,7 @@ void Assembler::check_relocation(RelocationHolder const& rspec, int format) {
// assert(format == imm32_operand, "cannot specify a nonzero format");
opnd = locate_operand(inst, call32_operand);
} else if (r->is_data()) {
- assert(format == imm_operand || format == disp32_operand
- LP64_ONLY(|| format == narrow_oop_operand), "format ok");
+ assert(format == imm_operand || format == disp32_operand || format == narrow_oop_operand, "format ok");
opnd = locate_operand(inst, (WhichOperand)format);
} else {
assert(format == imm_operand, "cannot specify a format");
@@ -1549,7 +1502,6 @@ void Assembler::addr_nop_8() {
}
void Assembler::addsd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_F2, VEX_OPCODE_0F, &attributes);
@@ -1557,7 +1509,6 @@ void Assembler::addsd(XMMRegister dst, XMMRegister src) {
}
void Assembler::addsd(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -1568,14 +1519,12 @@ void Assembler::addsd(XMMRegister dst, Address src) {
}
void Assembler::addss(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int16(0x58, (0xC0 | encode));
}
void Assembler::addss(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_32bit);
@@ -1834,9 +1783,6 @@ void Assembler::blsrl(Register dst, Address src) {
}
void Assembler::call(Label& L, relocInfo::relocType rtype) {
- // suspect disp32 is always good
- int operand = LP64_ONLY(disp32_operand) NOT_LP64(imm_operand);
-
if (L.is_bound()) {
const int long_size = 5;
int offs = (int)( target(L) - pc() );
@@ -1844,14 +1790,14 @@ void Assembler::call(Label& L, relocInfo::relocType rtype) {
InstructionMark im(this);
// 1110 1000 #32-bit disp
emit_int8((unsigned char)0xE8);
- emit_data(offs - long_size, rtype, operand);
+ emit_data(offs - long_size, rtype, disp32_operand);
} else {
InstructionMark im(this);
// 1110 1000 #32-bit disp
L.add_patch_at(code(), locator());
emit_int8((unsigned char)0xE8);
- emit_data(int(0), rtype, operand);
+ emit_data(int(0), rtype, disp32_operand);
}
}
@@ -1878,8 +1824,7 @@ void Assembler::call_literal(address entry, RelocationHolder const& rspec) {
// Technically, should use call32_operand, but this format is
// implied by the fact that we're emitting a call instruction.
- int operand = LP64_ONLY(disp32_operand) NOT_LP64(call32_operand);
- emit_data((int) disp, rspec, operand);
+ emit_data((int) disp, rspec, disp32_operand);
}
void Assembler::cdql() {
@@ -1891,7 +1836,6 @@ void Assembler::cld() {
}
void Assembler::cmovl(Condition cc, Register dst, Register src) {
- NOT_LP64(guarantee(VM_Version::supports_cmov(), "illegal instruction"));
int encode = prefix_and_encode(dst->encoding(), src->encoding(), true /* is_map1 */);
emit_opcode_prefix_and_encoding(0x40 | cc, 0xC0, encode);
}
@@ -1904,7 +1848,6 @@ void Assembler::ecmovl(Condition cc, Register dst, Register src1, Register src2)
void Assembler::cmovl(Condition cc, Register dst, Address src) {
InstructionMark im(this);
- NOT_LP64(guarantee(VM_Version::supports_cmov(), "illegal instruction"));
prefix(src, dst, false, true /* is_map1 */);
emit_int8((0x40 | cc));
emit_operand(dst, src, 0);
@@ -2032,7 +1975,6 @@ void Assembler::cmpxchgb(Register reg, Address adr) { // cmpxchg
void Assembler::comisd(XMMRegister dst, Address src) {
// NOTE: dbx seems to decode this as comiss even though the
// 0x66 is there. Strangely ucomisd comes out correct
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);;
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -2043,7 +1985,6 @@ void Assembler::comisd(XMMRegister dst, Address src) {
}
void Assembler::comisd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -2051,7 +1992,6 @@ void Assembler::comisd(XMMRegister dst, XMMRegister src) {
}
void Assembler::comiss(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_32bit);
@@ -2061,7 +2001,6 @@ void Assembler::comiss(XMMRegister dst, Address src) {
}
void Assembler::comiss(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
emit_int16(0x2F, (0xC0 | encode));
@@ -2071,6 +2010,11 @@ void Assembler::cpuid() {
emit_int16(0x0F, (unsigned char)0xA2);
}
+void Assembler::serialize() {
+ assert(VM_Version::supports_serialize(), "");
+ emit_int24(0x0F, 0x01, 0xE8);
+}
+
// Opcode / Instruction Op / En 64 - Bit Mode Compat / Leg Mode Description Implemented
// F2 0F 38 F0 / r CRC32 r32, r / m8 RM Valid Valid Accumulate CRC32 on r / m8. v
// F2 REX 0F 38 F0 / r CRC32 r32, r / m8* RM Valid N.E. Accumulate CRC32 on r / m8. -
@@ -2099,8 +2043,7 @@ void Assembler::crc32(Register crc, Register v, int8_t sizeInBytes) {
case 2:
case 4:
break;
- LP64_ONLY(case 8:)
- // This instruction is not valid in 32 bits
+ case 8:
// Note:
// http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf
//
@@ -2119,7 +2062,7 @@ void Assembler::crc32(Register crc, Register v, int8_t sizeInBytes) {
assert(0, "Unsupported value for a sizeInBytes argument");
break;
}
- LP64_ONLY(prefix(crc, v, p);)
+ prefix(crc, v, p);
emit_int32(0x0F,
0x38,
0xF0 | w,
@@ -2148,22 +2091,20 @@ void Assembler::crc32(Register crc, Address adr, int8_t sizeInBytes) {
case 2:
case 4:
break;
- LP64_ONLY(case 8:)
- // This instruction is not valid in 32 bits
+ case 8:
p = REX_W;
break;
default:
assert(0, "Unsupported value for a sizeInBytes argument");
break;
}
- LP64_ONLY(prefix(crc, adr, p);)
+ prefix(crc, adr, p);
emit_int24(0x0F, 0x38, (0xF0 | w));
emit_operand(crc, adr, 0);
}
}
void Assembler::cvtdq2pd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xE6, (0xC0 | encode));
@@ -2226,7 +2167,6 @@ void Assembler::vcvtph2ps(XMMRegister dst, Address src, int vector_len) {
}
void Assembler::cvtdq2ps(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
emit_int16(0x5B, (0xC0 | encode));
@@ -2240,7 +2180,6 @@ void Assembler::vcvtdq2ps(XMMRegister dst, XMMRegister src, int vector_len) {
}
void Assembler::cvtsd2ss(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, src, src, VEX_SIMD_F2, VEX_OPCODE_0F, &attributes);
@@ -2248,7 +2187,6 @@ void Assembler::cvtsd2ss(XMMRegister dst, XMMRegister src) {
}
void Assembler::cvtsd2ss(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -2259,14 +2197,12 @@ void Assembler::cvtsd2ss(XMMRegister dst, Address src) {
}
void Assembler::cvtsi2sdl(XMMRegister dst, Register src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, as_XMMRegister(src->encoding()), VEX_SIMD_F2, VEX_OPCODE_0F, &attributes, true);
emit_int16(0x2A, (0xC0 | encode));
}
void Assembler::cvtsi2sdl(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_32bit);
@@ -2276,14 +2212,12 @@ void Assembler::cvtsi2sdl(XMMRegister dst, Address src) {
}
void Assembler::cvtsi2ssl(XMMRegister dst, Register src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, as_XMMRegister(src->encoding()), VEX_SIMD_F3, VEX_OPCODE_0F, &attributes, true);
emit_int16(0x2A, (0xC0 | encode));
}
void Assembler::cvtsi2ssl(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_32bit);
@@ -2293,21 +2227,18 @@ void Assembler::cvtsi2ssl(XMMRegister dst, Address src) {
}
void Assembler::cvtsi2ssq(XMMRegister dst, Register src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, as_XMMRegister(src->encoding()), VEX_SIMD_F3, VEX_OPCODE_0F, &attributes, true);
emit_int16(0x2A, (0xC0 | encode));
}
void Assembler::cvtss2sd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, src, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int16(0x5A, (0xC0 | encode));
}
void Assembler::cvtss2sd(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_32bit);
@@ -2318,28 +2249,24 @@ void Assembler::cvtss2sd(XMMRegister dst, Address src) {
void Assembler::cvttsd2sil(Register dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(as_XMMRegister(dst->encoding()), xnoreg, src, VEX_SIMD_F2, VEX_OPCODE_0F, &attributes);
emit_int16(0x2C, (0xC0 | encode));
}
void Assembler::cvtss2sil(Register dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(as_XMMRegister(dst->encoding()), xnoreg, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int16(0x2D, (0xC0 | encode));
}
void Assembler::cvttss2sil(Register dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(as_XMMRegister(dst->encoding()), xnoreg, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int16(0x2C, (0xC0 | encode));
}
void Assembler::cvttpd2dq(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
int vector_len = VM_Version::supports_avx512novl() ? AVX_512bit : AVX_128bit;
InstructionAttr attributes(vector_len, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -2552,7 +2479,6 @@ void Assembler::edecl(Register dst, Address src, bool no_flags) {
}
void Assembler::divsd(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -2563,7 +2489,6 @@ void Assembler::divsd(XMMRegister dst, Address src) {
}
void Assembler::divsd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_F2, VEX_OPCODE_0F, &attributes);
@@ -2571,7 +2496,6 @@ void Assembler::divsd(XMMRegister dst, XMMRegister src) {
}
void Assembler::divss(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_32bit);
@@ -2581,7 +2505,6 @@ void Assembler::divss(XMMRegister dst, Address src) {
}
void Assembler::divss(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int16(0x5E, (0xC0 | encode));
@@ -2853,7 +2776,6 @@ void Assembler::ldmxcsr( Address src) {
emit_int8((unsigned char)0xAE);
emit_operand(as_Register(2), src, 0);
} else {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
prefix(src, true /* is_map1 */);
emit_int8((unsigned char)0xAE);
@@ -2868,7 +2790,6 @@ void Assembler::leal(Register dst, Address src) {
emit_operand(dst, src, 0);
}
-#ifdef _LP64
void Assembler::lea(Register dst, Label& L) {
emit_prefix_and_int8(get_prefixq(Address(), dst), (unsigned char)0x8D);
if (!L.is_bound()) {
@@ -2886,7 +2807,6 @@ void Assembler::lea(Register dst, Label& L) {
emit_int32(disp);
}
}
-#endif
void Assembler::lfence() {
emit_int24(0x0F, (unsigned char)0xAE, (unsigned char)0xE8);
@@ -2935,22 +2855,19 @@ void Assembler::elzcntl(Register dst, Address src, bool no_flags) {
// Emit mfence instruction
void Assembler::mfence() {
- NOT_LP64(assert(VM_Version::supports_sse2(), "unsupported");)
emit_int24(0x0F, (unsigned char)0xAE, (unsigned char)0xF0);
}
// Emit sfence instruction
void Assembler::sfence() {
- NOT_LP64(assert(VM_Version::supports_sse2(), "unsupported");)
emit_int24(0x0F, (unsigned char)0xAE, (unsigned char)0xF8);
}
void Assembler::mov(Register dst, Register src) {
- LP64_ONLY(movq(dst, src)) NOT_LP64(movl(dst, src));
+ movq(dst, src);
}
void Assembler::movapd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
int vector_len = VM_Version::supports_avx512novl() ? AVX_512bit : AVX_128bit;
InstructionAttr attributes(vector_len, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
@@ -2959,7 +2876,6 @@ void Assembler::movapd(XMMRegister dst, XMMRegister src) {
}
void Assembler::movaps(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
int vector_len = VM_Version::supports_avx512novl() ? AVX_512bit : AVX_128bit;
InstructionAttr attributes(vector_len, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
@@ -2967,14 +2883,12 @@ void Assembler::movaps(XMMRegister dst, XMMRegister src) {
}
void Assembler::movlhps(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, src, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
emit_int16(0x16, (0xC0 | encode));
}
void Assembler::movb(Register dst, Address src) {
- NOT_LP64(assert(dst->has_byte_register(), "must have byte register"));
InstructionMark im(this);
prefix(src, dst, true);
emit_int8((unsigned char)0x8A);
@@ -3403,14 +3317,12 @@ void Assembler::movb(Address dst, Register src) {
}
void Assembler::movdl(XMMRegister dst, Register src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, xnoreg, as_XMMRegister(src->encoding()), VEX_SIMD_66, VEX_OPCODE_0F, &attributes, true);
emit_int16(0x6E, (0xC0 | encode));
}
void Assembler::movdl(Register dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
// swap src/dst to get correct prefix
int encode = simd_prefix_and_encode(src, xnoreg, as_XMMRegister(dst->encoding()), VEX_SIMD_66, VEX_OPCODE_0F, &attributes, true);
@@ -3418,7 +3330,6 @@ void Assembler::movdl(Register dst, XMMRegister src) {
}
void Assembler::movdl(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_32bit);
@@ -3428,7 +3339,6 @@ void Assembler::movdl(XMMRegister dst, Address src) {
}
void Assembler::movdl(Address dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_32bit);
@@ -3438,14 +3348,12 @@ void Assembler::movdl(Address dst, XMMRegister src) {
}
void Assembler::movdqa(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16(0x6F, (0xC0 | encode));
}
void Assembler::movdqa(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_address_attributes(/* tuple_type */ EVEX_FVM, /* input_size_in_bits */ EVEX_NObit);
@@ -3455,7 +3363,6 @@ void Assembler::movdqa(XMMRegister dst, Address src) {
}
void Assembler::movdqu(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_address_attributes(/* tuple_type */ EVEX_FVM, /* input_size_in_bits */ EVEX_NObit);
@@ -3465,14 +3372,12 @@ void Assembler::movdqu(XMMRegister dst, Address src) {
}
void Assembler::movdqu(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int16(0x6F, (0xC0 | encode));
}
void Assembler::movdqu(Address dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_address_attributes(/* tuple_type */ EVEX_FVM, /* input_size_in_bits */ EVEX_NObit);
@@ -3917,7 +3822,6 @@ void Assembler::movl(Address dst, Register src) {
// when loading from memory. But for old Opteron use movlpd instead of movsd.
// The selection is done in MacroAssembler::movdbl() and movflt().
void Assembler::movlpd(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -3928,7 +3832,6 @@ void Assembler::movlpd(XMMRegister dst, Address src) {
}
void Assembler::movq(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -3939,7 +3842,6 @@ void Assembler::movq(XMMRegister dst, Address src) {
}
void Assembler::movq(Address dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -3950,7 +3852,6 @@ void Assembler::movq(Address dst, XMMRegister src) {
}
void Assembler::movq(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(src, xnoreg, dst, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -3958,7 +3859,6 @@ void Assembler::movq(XMMRegister dst, XMMRegister src) {
}
void Assembler::movq(Register dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
// swap src/dst to get correct prefix
int encode = simd_prefix_and_encode(src, xnoreg, as_XMMRegister(dst->encoding()), VEX_SIMD_66, VEX_OPCODE_0F, &attributes, true);
@@ -3966,7 +3866,6 @@ void Assembler::movq(Register dst, XMMRegister src) {
}
void Assembler::movq(XMMRegister dst, Register src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, xnoreg, as_XMMRegister(src->encoding()), VEX_SIMD_66, VEX_OPCODE_0F, &attributes, true);
emit_int16(0x6E, (0xC0 | encode));
@@ -3980,13 +3879,11 @@ void Assembler::movsbl(Register dst, Address src) { // movsxb
}
void Assembler::movsbl(Register dst, Register src) { // movsxb
- NOT_LP64(assert(src->has_byte_register(), "must have byte register"));
int encode = prefix_and_encode(dst->encoding(), false, src->encoding(), true, true /* is_map1 */);
emit_opcode_prefix_and_encoding((unsigned char)0xBE, 0xC0, encode);
}
void Assembler::movsd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_F2, VEX_OPCODE_0F, &attributes);
@@ -3994,7 +3891,6 @@ void Assembler::movsd(XMMRegister dst, XMMRegister src) {
}
void Assembler::movsd(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -4005,7 +3901,6 @@ void Assembler::movsd(XMMRegister dst, Address src) {
}
void Assembler::movsd(Address dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -4025,14 +3920,12 @@ void Assembler::vmovsd(XMMRegister dst, XMMRegister src, XMMRegister src2) {
}
void Assembler::movss(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int16(0x10, (0xC0 | encode));
}
void Assembler::movss(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_32bit);
@@ -4042,7 +3935,6 @@ void Assembler::movss(XMMRegister dst, Address src) {
}
void Assembler::movss(Address dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_32bit);
@@ -4065,7 +3957,6 @@ void Assembler::movswl(Register dst, Register src) { // movsxw
}
void Assembler::movups(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_address_attributes(/* tuple_type */ EVEX_FVM, /* input_size_in_bits */ EVEX_NObit);
@@ -4085,7 +3976,6 @@ void Assembler::vmovups(XMMRegister dst, Address src, int vector_len) {
}
void Assembler::movups(Address dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_address_attributes(/* tuple_type */ EVEX_FVM, /* input_size_in_bits */ EVEX_NObit);
@@ -4138,7 +4028,6 @@ void Assembler::movzbl(Register dst, Address src) { // movzxb
}
void Assembler::movzbl(Register dst, Register src) { // movzxb
- NOT_LP64(assert(src->has_byte_register(), "must have byte register"));
int encode = prefix_and_encode(dst->encoding(), false, src->encoding(), true, true /* is_map1 */);
emit_opcode_prefix_and_encoding((unsigned char)0xB6, 0xC0, encode);
}
@@ -4183,7 +4072,6 @@ void Assembler::emull(Register src, bool no_flags) {
}
void Assembler::mulsd(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -4194,7 +4082,6 @@ void Assembler::mulsd(XMMRegister dst, Address src) {
}
void Assembler::mulsd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_F2, VEX_OPCODE_0F, &attributes);
@@ -4202,7 +4089,6 @@ void Assembler::mulsd(XMMRegister dst, XMMRegister src) {
}
void Assembler::mulss(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_32bit);
@@ -4212,7 +4098,6 @@ void Assembler::mulss(XMMRegister dst, Address src) {
}
void Assembler::mulss(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int16(0x59, (0xC0 | encode));
@@ -4675,7 +4560,6 @@ void Assembler::eorb(Register dst, Address src1, Register src2, bool no_flags) {
}
void Assembler::packsswb(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16(0x63, (0xC0 | encode));
@@ -4689,7 +4573,6 @@ void Assembler::vpacksswb(XMMRegister dst, XMMRegister nds, XMMRegister src, int
}
void Assembler::packssdw(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), "");)
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16(0x6B, (0xC0 | encode));
@@ -4703,7 +4586,6 @@ void Assembler::vpackssdw(XMMRegister dst, XMMRegister nds, XMMRegister src, int
}
void Assembler::packuswb(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
assert((UseAVX > 0), "SSE mode requires address alignment 16 bytes");
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
@@ -4714,7 +4596,6 @@ void Assembler::packuswb(XMMRegister dst, Address src) {
}
void Assembler::packuswb(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16(0x67, (0xC0 | encode));
@@ -4895,7 +4776,6 @@ void Assembler::pcmpestri(XMMRegister dst, XMMRegister src, int imm8) {
// In this context, the dst vector contains the components that are equal; non-equal components are zeroed in dst
void Assembler::pcmpeqb(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), "");)
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ true, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16(0x74, (0xC0 | encode));
@@ -5043,7 +4923,6 @@ void Assembler::evpcmpeqb(KRegister kdst, KRegister mask, XMMRegister nds, Addre
// In this context, the dst vector contains the components that are equal; non-equal components are zeroed in dst
void Assembler::pcmpeqw(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), "");)
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ true, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16(0x75, (0xC0 | encode));
@@ -5092,7 +4971,6 @@ void Assembler::evpcmpeqw(KRegister kdst, XMMRegister nds, Address src, int vect
// In this context, the dst vector contains the components that are equal; non-equal components are zeroed in dst
void Assembler::pcmpeqd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), "");)
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ true, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16(0x76, (0xC0 | encode));
@@ -5197,7 +5075,6 @@ void Assembler::pcmpgtq(XMMRegister dst, XMMRegister src) {
}
void Assembler::pmovmskb(Register dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), "");)
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ true, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(as_XMMRegister(dst->encoding()), xnoreg, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xD7, (0xC0 | encode));
@@ -5263,7 +5140,6 @@ void Assembler::pextrq(Address dst, XMMRegister src, int imm8) {
}
void Assembler::pextrw(Register dst, XMMRegister src, int imm8) {
- NOT_LP64(assert(VM_Version::supports_sse2(), "");)
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(as_XMMRegister(dst->encoding()), xnoreg, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int24((unsigned char)0xC5, (0xC0 | encode), imm8);
@@ -5349,14 +5225,12 @@ void Assembler::vpinsrq(XMMRegister dst, XMMRegister nds, Register src, int imm8
}
void Assembler::pinsrw(XMMRegister dst, Register src, int imm8) {
- NOT_LP64(assert(VM_Version::supports_sse2(), "");)
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, as_XMMRegister(src->encoding()), VEX_SIMD_66, VEX_OPCODE_0F, &attributes, true);
emit_int24((unsigned char)0xC4, (0xC0 | encode), imm8);
}
void Assembler::pinsrw(XMMRegister dst, Address src, int imm8) {
- NOT_LP64(assert(VM_Version::supports_sse2(), "");)
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_16bit);
@@ -5666,7 +5540,6 @@ void Assembler::vpmovzxwq(XMMRegister dst, XMMRegister src, int vector_len) {
}
void Assembler::pmaddwd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xF5, (0xC0 | encode));
@@ -5882,18 +5755,7 @@ void Assembler::popf() {
emit_int8((unsigned char)0x9D);
}
-#ifndef _LP64 // no 32bit push/pop on amd64
-void Assembler::popl(Address dst) {
- // NOTE: this will adjust stack by 8byte on 64bits
- InstructionMark im(this);
- prefix(dst);
- emit_int8((unsigned char)0x8F);
- emit_operand(rax, dst, 0);
-}
-#endif
-
void Assembler::prefetchnta(Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), "must support"));
InstructionMark im(this);
prefix(src, true /* is_map1 */);
emit_int8(0x18);
@@ -5909,7 +5771,6 @@ void Assembler::prefetchr(Address src) {
}
void Assembler::prefetcht0(Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), "must support"));
InstructionMark im(this);
prefix(src, true /* is_map1 */);
emit_int8(0x18);
@@ -5917,7 +5778,6 @@ void Assembler::prefetcht0(Address src) {
}
void Assembler::prefetcht1(Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), "must support"));
InstructionMark im(this);
prefix(src, true /* is_map1 */);
emit_int8(0x18);
@@ -5925,7 +5785,6 @@ void Assembler::prefetcht1(Address src) {
}
void Assembler::prefetcht2(Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), "must support"));
InstructionMark im(this);
prefix(src, true /* is_map1 */);
emit_int8(0x18);
@@ -6002,7 +5861,6 @@ void Assembler::pshufb(XMMRegister dst, Address src) {
void Assembler::pshufd(XMMRegister dst, XMMRegister src, int mode) {
assert(isByte(mode), "invalid value");
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
int vector_len = VM_Version::supports_avx512novl() ? AVX_512bit : AVX_128bit;
InstructionAttr attributes(vector_len, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -6013,7 +5871,6 @@ void Assembler::vpshufd(XMMRegister dst, XMMRegister src, int mode, int vector_l
assert(vector_len == AVX_128bit? VM_Version::supports_avx() :
(vector_len == AVX_256bit? VM_Version::supports_avx2() :
(vector_len == AVX_512bit? VM_Version::supports_evex() : 0)), "");
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(vector_len, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int24(0x70, (0xC0 | encode), mode & 0xFF);
@@ -6021,7 +5878,6 @@ void Assembler::vpshufd(XMMRegister dst, XMMRegister src, int mode, int vector_l
void Assembler::pshufd(XMMRegister dst, Address src, int mode) {
assert(isByte(mode), "invalid value");
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
assert((UseAVX > 0), "SSE mode requires address alignment 16 bytes");
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
@@ -6034,7 +5890,6 @@ void Assembler::pshufd(XMMRegister dst, Address src, int mode) {
void Assembler::pshufhw(XMMRegister dst, XMMRegister src, int mode) {
assert(isByte(mode), "invalid value");
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int24(0x70, (0xC0 | encode), mode & 0xFF);
@@ -6044,7 +5899,6 @@ void Assembler::vpshufhw(XMMRegister dst, XMMRegister src, int mode, int vector_
assert(vector_len == AVX_128bit ? VM_Version::supports_avx() :
(vector_len == AVX_256bit ? VM_Version::supports_avx2() :
(vector_len == AVX_512bit ? VM_Version::supports_avx512bw() : false)), "");
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(vector_len, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int24(0x70, (0xC0 | encode), mode & 0xFF);
@@ -6052,7 +5906,6 @@ void Assembler::vpshufhw(XMMRegister dst, XMMRegister src, int mode, int vector_
void Assembler::pshuflw(XMMRegister dst, XMMRegister src, int mode) {
assert(isByte(mode), "invalid value");
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_F2, VEX_OPCODE_0F, &attributes);
emit_int24(0x70, (0xC0 | encode), mode & 0xFF);
@@ -6060,7 +5913,6 @@ void Assembler::pshuflw(XMMRegister dst, XMMRegister src, int mode) {
void Assembler::pshuflw(XMMRegister dst, Address src, int mode) {
assert(isByte(mode), "invalid value");
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
assert((UseAVX > 0), "SSE mode requires address alignment 16 bytes");
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
@@ -6075,7 +5927,6 @@ void Assembler::vpshuflw(XMMRegister dst, XMMRegister src, int mode, int vector_
assert(vector_len == AVX_128bit ? VM_Version::supports_avx() :
(vector_len == AVX_256bit ? VM_Version::supports_avx2() :
(vector_len == AVX_512bit ? VM_Version::supports_avx512bw() : false)), "");
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(vector_len, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_F2, VEX_OPCODE_0F, &attributes);
emit_int24(0x70, (0xC0 | encode), mode & 0xFF);
@@ -6092,7 +5943,6 @@ void Assembler::evshufi64x2(XMMRegister dst, XMMRegister nds, XMMRegister src, i
void Assembler::shufpd(XMMRegister dst, XMMRegister src, int imm8) {
assert(isByte(imm8), "invalid value");
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int24((unsigned char)0xC6, (0xC0 | encode), imm8 & 0xFF);
@@ -6107,7 +5957,6 @@ void Assembler::vshufpd(XMMRegister dst, XMMRegister nds, XMMRegister src, int i
void Assembler::shufps(XMMRegister dst, XMMRegister src, int imm8) {
assert(isByte(imm8), "invalid value");
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
emit_int24((unsigned char)0xC6, (0xC0 | encode), imm8 & 0xFF);
@@ -6121,7 +5970,6 @@ void Assembler::vshufps(XMMRegister dst, XMMRegister nds, XMMRegister src, int i
void Assembler::psrldq(XMMRegister dst, int shift) {
// Shift right 128 bit value in dst XMMRegister by shift number of bytes.
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(xmm3, dst, dst, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int24(0x73, (0xC0 | encode), shift);
@@ -6138,7 +5986,6 @@ void Assembler::vpsrldq(XMMRegister dst, XMMRegister src, int shift, int vector_
void Assembler::pslldq(XMMRegister dst, int shift) {
// Shift left 128 bit value in dst XMMRegister by shift number of bytes.
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
// XMM7 is for /7 encoding: 66 0F 73 /7 ib
int encode = simd_prefix_and_encode(xmm7, dst, dst, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -6234,7 +6081,6 @@ void Assembler::evptestnmd(KRegister dst, XMMRegister nds, XMMRegister src, int
}
void Assembler::punpcklbw(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
assert((UseAVX > 0), "SSE mode requires address alignment 16 bytes");
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_vlbw, /* no_mask_reg */ true, /* uses_vl */ true);
@@ -6245,14 +6091,12 @@ void Assembler::punpcklbw(XMMRegister dst, Address src) {
}
void Assembler::punpcklbw(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_vlbw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16(0x60, (0xC0 | encode));
}
void Assembler::punpckldq(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
assert((UseAVX > 0), "SSE mode requires address alignment 16 bytes");
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
@@ -6263,14 +6107,12 @@ void Assembler::punpckldq(XMMRegister dst, Address src) {
}
void Assembler::punpckldq(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16(0x62, (0xC0 | encode));
}
void Assembler::punpcklqdq(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -6313,7 +6155,6 @@ void Assembler::evpunpckhqdq(XMMRegister dst, KRegister mask, XMMRegister src1,
emit_int16(0x6D, (0xC0 | encode));
}
-#ifdef _LP64
void Assembler::push2(Register src1, Register src2, bool with_ppx) {
assert(VM_Version::supports_apx_f(), "requires APX");
InstructionAttr attributes(0, /* rex_w */ with_ppx, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
@@ -6375,8 +6216,6 @@ void Assembler::popp(Register dst) {
int encode = prefixq_and_encode_rex2(dst->encoding());
emit_int8((unsigned char)0x58 | encode);
}
-#endif //_LP64
-
void Assembler::push(int32_t imm32) {
// in 64bits we push 64bits onto the stack but only
@@ -6394,16 +6233,6 @@ void Assembler::pushf() {
emit_int8((unsigned char)0x9C);
}
-#ifndef _LP64 // no 32bit push/pop on amd64
-void Assembler::pushl(Address src) {
- // Note this will push 64bit on 64bit
- InstructionMark im(this);
- prefix(src);
- emit_int8((unsigned char)0xFF);
- emit_operand(rsi, src, 0);
-}
-#endif
-
void Assembler::rcll(Register dst, int imm8) {
assert(isShiftCount(imm8), "illegal shift count");
int encode = prefix_and_encode(dst->encoding());
@@ -6426,14 +6255,12 @@ void Assembler::ercll(Register dst, Register src, int imm8) {
}
void Assembler::rcpps(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ true, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
emit_int16(0x53, (0xC0 | encode));
}
void Assembler::rcpss(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ true, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int16(0x53, (0xC0 | encode));
@@ -6448,43 +6275,37 @@ void Assembler::rdtsc() {
void Assembler::rep_mov() {
// REP
// MOVSQ
- LP64_ONLY(emit_int24((unsigned char)0xF3, REX_W, (unsigned char)0xA5);)
- NOT_LP64( emit_int16((unsigned char)0xF3, (unsigned char)0xA5);)
+ emit_int24((unsigned char)0xF3, REX_W, (unsigned char)0xA5);
}
// sets rcx bytes at [edi] to the value in rax
void Assembler::rep_stosb() {
// REP
// STOSB
- LP64_ONLY(emit_int24((unsigned char)0xF3, REX_W, (unsigned char)0xAA);)
- NOT_LP64( emit_int16((unsigned char)0xF3, (unsigned char)0xAA);)
+ emit_int24((unsigned char)0xF3, REX_W, (unsigned char)0xAA);
}
// sets rcx pointer-sized words at [edi] to the value in rax
// generic
void Assembler::rep_stos() {
// REP
- // LP64:STOSQ, LP32:STOSD
- LP64_ONLY(emit_int24((unsigned char)0xF3, REX_W, (unsigned char)0xAB);)
- NOT_LP64( emit_int16((unsigned char)0xF3, (unsigned char)0xAB);)
+ // STOSQ
+ emit_int24((unsigned char)0xF3, REX_W, (unsigned char)0xAB);
}
// scans rcx pointer-sized words at [edi] for an occurrence of rax
// generic
void Assembler::repne_scan() { // repne_scan
// SCASQ
- LP64_ONLY(emit_int24((unsigned char)0xF2, REX_W, (unsigned char)0xAF);)
- NOT_LP64( emit_int16((unsigned char)0xF2, (unsigned char)0xAF);)
+ emit_int24((unsigned char)0xF2, REX_W, (unsigned char)0xAF);
}
-#ifdef _LP64
// scans rcx 4-byte words at [edi] for an occurrence of rax
// generic
void Assembler::repne_scanl() { // repne_scan
// SCASL
emit_int16((unsigned char)0xF2, (unsigned char)0xAF);
}
-#endif
void Assembler::ret(int imm16) {
if (imm16 == 0) {
@@ -6559,7 +6380,6 @@ void Assembler::erorl(Register dst, Register src, bool no_flags) {
emit_int16((unsigned char)0xD3, (0xC8 | encode));
}
-#ifdef _LP64
void Assembler::rorq(Register dst) {
int encode = prefixq_and_encode(dst->encoding());
emit_int16((unsigned char)0xD3, (0xC8 | encode));
@@ -6622,15 +6442,6 @@ void Assembler::erolq(Register dst, Register src, int imm8, bool no_flags) {
} else {
emit_int24((unsigned char)0xC1, (0xc0 | encode), imm8);
}
- }
-#endif
-
-void Assembler::sahf() {
-#ifdef _LP64
- // Not supported in 64bit mode
- ShouldNotReachHere();
-#endif
- emit_int8((unsigned char)0x9E);
}
void Assembler::sall(Address dst, int imm8) {
@@ -7083,7 +6894,6 @@ void Assembler::eshrdl(Register dst, Register src1, Register src2, int8_t imm8,
emit_int24(0x2C, (0xC0 | encode), imm8);
}
-#ifdef _LP64
void Assembler::shldq(Register dst, Register src, int8_t imm8) {
int encode = prefixq_and_encode(src->encoding(), dst->encoding(), true /* is_map1 */);
emit_opcode_prefix_and_encoding((unsigned char)0xA4, 0xC0, encode, imm8);
@@ -7109,7 +6919,6 @@ void Assembler::eshrdq(Register dst, Register src1, Register src2, int8_t imm8,
int encode = evex_prefix_and_encode_ndd(src2->encoding(), dst->encoding(), src1->encoding(), VEX_SIMD_NONE, /* MAP4 */VEX_OPCODE_0F_3C, &attributes, no_flags);
emit_int24(0x2C, (0xC0 | encode), imm8);
}
-#endif
// copies a single word from [esi] to [edi]
void Assembler::smovl() {
@@ -7134,7 +6943,6 @@ void Assembler::roundsd(XMMRegister dst, Address src, int32_t rmode) {
}
void Assembler::sqrtsd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_F2, VEX_OPCODE_0F, &attributes);
@@ -7142,7 +6950,6 @@ void Assembler::sqrtsd(XMMRegister dst, XMMRegister src) {
}
void Assembler::sqrtsd(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -7153,7 +6960,6 @@ void Assembler::sqrtsd(XMMRegister dst, Address src) {
}
void Assembler::sqrtss(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int16(0x51, (0xC0 | encode));
@@ -7164,7 +6970,6 @@ void Assembler::std() {
}
void Assembler::sqrtss(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_32bit);
@@ -7185,7 +6990,6 @@ void Assembler::stmxcsr(Address dst) {
emit_int8((unsigned char)0xAE);
emit_operand(as_Register(3), dst, 0);
} else {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
prefix(dst, true /* is_map1 */);
emit_int8((unsigned char)0xAE);
@@ -7276,7 +7080,6 @@ void Assembler::esubl(Register dst, Register src1, Register src2, bool no_flags)
}
void Assembler::subsd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_F2, VEX_OPCODE_0F, &attributes);
@@ -7284,7 +7087,6 @@ void Assembler::subsd(XMMRegister dst, XMMRegister src) {
}
void Assembler::subsd(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -7295,14 +7097,12 @@ void Assembler::subsd(XMMRegister dst, Address src) {
}
void Assembler::subss(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true , /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int16(0x5C, (0xC0 | encode));
}
void Assembler::subss(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_32bit);
@@ -7312,7 +7112,6 @@ void Assembler::subss(XMMRegister dst, Address src) {
}
void Assembler::testb(Register dst, int imm8, bool use_ral) {
- NOT_LP64(assert(dst->has_byte_register(), "must have byte register"));
if (dst == rax) {
if (use_ral) {
emit_int8((unsigned char)0xA8);
@@ -7438,7 +7237,6 @@ void Assembler::etzcntq(Register dst, Address src, bool no_flags) {
}
void Assembler::ucomisd(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -7449,7 +7247,6 @@ void Assembler::ucomisd(XMMRegister dst, Address src) {
}
void Assembler::ucomisd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -7457,7 +7254,6 @@ void Assembler::ucomisd(XMMRegister dst, XMMRegister src) {
}
void Assembler::ucomiss(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_32bit);
@@ -7467,7 +7263,6 @@ void Assembler::ucomiss(XMMRegister dst, Address src) {
}
void Assembler::ucomiss(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
emit_int16(0x2E, (0xC0 | encode));
@@ -7859,7 +7654,6 @@ void Assembler::vsubss(XMMRegister dst, XMMRegister nds, XMMRegister src) {
// Float-point vector arithmetic
void Assembler::addpd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -7867,7 +7661,6 @@ void Assembler::addpd(XMMRegister dst, XMMRegister src) {
}
void Assembler::addpd(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
@@ -7879,7 +7672,6 @@ void Assembler::addpd(XMMRegister dst, Address src) {
void Assembler::addps(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
emit_int16(0x58, (0xC0 | encode));
@@ -7922,7 +7714,6 @@ void Assembler::vaddps(XMMRegister dst, XMMRegister nds, Address src, int vector
}
void Assembler::subpd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -7930,7 +7721,6 @@ void Assembler::subpd(XMMRegister dst, XMMRegister src) {
}
void Assembler::subps(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
emit_int16(0x5C, (0xC0 | encode));
@@ -7973,7 +7763,6 @@ void Assembler::vsubps(XMMRegister dst, XMMRegister nds, Address src, int vector
}
void Assembler::mulpd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -7981,7 +7770,6 @@ void Assembler::mulpd(XMMRegister dst, XMMRegister src) {
}
void Assembler::mulpd(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_address_attributes(/* tuple_type */ EVEX_FV, /* input_size_in_bits */ EVEX_NObit);
@@ -7992,7 +7780,6 @@ void Assembler::mulpd(XMMRegister dst, Address src) {
}
void Assembler::mulps(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
emit_int16(0x59, (0xC0 | encode));
@@ -8069,7 +7856,6 @@ void Assembler::vfmadd231ps(XMMRegister dst, XMMRegister src1, Address src2, int
}
void Assembler::divpd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -8077,7 +7863,6 @@ void Assembler::divpd(XMMRegister dst, XMMRegister src) {
}
void Assembler::divps(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
emit_int16(0x5E, (0xC0 | encode));
@@ -8211,7 +7996,6 @@ void Assembler::vsqrtps(XMMRegister dst, Address src, int vector_len) {
}
void Assembler::andpd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ !_legacy_mode_dq, /* legacy_mode */ _legacy_mode_dq, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -8219,7 +8003,6 @@ void Assembler::andpd(XMMRegister dst, XMMRegister src) {
}
void Assembler::andnpd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ !_legacy_mode_dq, /* legacy_mode */ _legacy_mode_dq, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -8227,14 +8010,12 @@ void Assembler::andnpd(XMMRegister dst, XMMRegister src) {
}
void Assembler::andps(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_dq, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
emit_int16(0x54, (0xC0 | encode));
}
void Assembler::andps(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_dq, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_address_attributes(/* tuple_type */ EVEX_FV, /* input_size_in_bits */ EVEX_NObit);
@@ -8244,7 +8025,6 @@ void Assembler::andps(XMMRegister dst, Address src) {
}
void Assembler::andpd(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ !_legacy_mode_dq, /* legacy_mode */ _legacy_mode_dq, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_address_attributes(/* tuple_type */ EVEX_FV, /* input_size_in_bits */ EVEX_NObit);
@@ -8291,7 +8071,6 @@ void Assembler::vandps(XMMRegister dst, XMMRegister nds, Address src, int vector
}
void Assembler::unpckhpd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -8300,7 +8079,6 @@ void Assembler::unpckhpd(XMMRegister dst, XMMRegister src) {
}
void Assembler::unpcklpd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -8308,7 +8086,6 @@ void Assembler::unpcklpd(XMMRegister dst, XMMRegister src) {
}
void Assembler::xorpd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ !_legacy_mode_dq, /* legacy_mode */ _legacy_mode_dq, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -8316,14 +8093,12 @@ void Assembler::xorpd(XMMRegister dst, XMMRegister src) {
}
void Assembler::xorps(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_dq, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
emit_int16(0x57, (0xC0 | encode));
}
void Assembler::xorpd(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ !_legacy_mode_dq, /* legacy_mode */ _legacy_mode_dq, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_address_attributes(/* tuple_type */ EVEX_FV, /* input_size_in_bits */ EVEX_NObit);
@@ -8334,7 +8109,6 @@ void Assembler::xorpd(XMMRegister dst, Address src) {
}
void Assembler::xorps(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_dq, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_address_attributes(/* tuple_type */ EVEX_FV, /* input_size_in_bits */ EVEX_NObit);
@@ -8397,28 +8171,24 @@ void Assembler::vphaddd(XMMRegister dst, XMMRegister nds, XMMRegister src, int v
}
void Assembler::paddb(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xFC, (0xC0 | encode));
}
void Assembler::paddw(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xFD, (0xC0 | encode));
}
void Assembler::paddd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xFE, (0xC0 | encode));
}
void Assembler::paddd(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
simd_prefix(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -8427,7 +8197,6 @@ void Assembler::paddd(XMMRegister dst, Address src) {
}
void Assembler::paddq(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -8738,14 +8507,12 @@ void Assembler::vpsubusw(XMMRegister dst, XMMRegister nds, Address src, int vect
void Assembler::psubb(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xF8, (0xC0 | encode));
}
void Assembler::psubw(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xF9, (0xC0 | encode));
@@ -8758,7 +8525,6 @@ void Assembler::psubd(XMMRegister dst, XMMRegister src) {
}
void Assembler::psubq(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -8837,7 +8603,6 @@ void Assembler::vpsubq(XMMRegister dst, XMMRegister nds, Address src, int vector
}
void Assembler::pmullw(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xD5, (0xC0 | encode));
@@ -8851,7 +8616,6 @@ void Assembler::pmulld(XMMRegister dst, XMMRegister src) {
}
void Assembler::pmuludq(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), "");)
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xF4, (0xC0 | encode));
@@ -8952,7 +8716,6 @@ void Assembler::vpminsb(XMMRegister dst, XMMRegister nds, XMMRegister src, int v
}
void Assembler::pminsw(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), "");)
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xEA, (0xC0 | encode));
@@ -8990,7 +8753,6 @@ void Assembler::vpminsq(XMMRegister dst, XMMRegister nds, XMMRegister src, int v
}
void Assembler::minps(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
emit_int16(0x5D, (0xC0 | encode));
@@ -9003,7 +8765,6 @@ void Assembler::vminps(XMMRegister dst, XMMRegister nds, XMMRegister src, int ve
}
void Assembler::minpd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16(0x5D, (0xC0 | encode));
@@ -9031,7 +8792,6 @@ void Assembler::vpmaxsb(XMMRegister dst, XMMRegister nds, XMMRegister src, int v
}
void Assembler::pmaxsw(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), "");)
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xEE, (0xC0 | encode));
@@ -9069,7 +8829,6 @@ void Assembler::vpmaxsq(XMMRegister dst, XMMRegister nds, XMMRegister src, int v
}
void Assembler::maxps(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_NONE, VEX_OPCODE_0F, &attributes);
emit_int16(0x5F, (0xC0 | encode));
@@ -9083,7 +8842,6 @@ void Assembler::vmaxps(XMMRegister dst, XMMRegister nds, XMMRegister src, int ve
}
void Assembler::maxpd(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, xnoreg, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16(0x5F, (0xC0 | encode));
@@ -9438,7 +9196,6 @@ void Assembler::evpmaxuq(XMMRegister dst, KRegister mask, XMMRegister nds, Addre
// Shift packed integers left by specified number of bits.
void Assembler::psllw(XMMRegister dst, int shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
// XMM6 is for /6 encoding: 66 0F 71 /6 ib
int encode = simd_prefix_and_encode(xmm6, dst, dst, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -9446,7 +9203,6 @@ void Assembler::psllw(XMMRegister dst, int shift) {
}
void Assembler::pslld(XMMRegister dst, int shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
// XMM6 is for /6 encoding: 66 0F 72 /6 ib
int encode = simd_prefix_and_encode(xmm6, dst, dst, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -9454,7 +9210,6 @@ void Assembler::pslld(XMMRegister dst, int shift) {
}
void Assembler::psllq(XMMRegister dst, int shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
// XMM6 is for /6 encoding: 66 0F 73 /6 ib
int encode = simd_prefix_and_encode(xmm6, dst, dst, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -9462,21 +9217,18 @@ void Assembler::psllq(XMMRegister dst, int shift) {
}
void Assembler::psllw(XMMRegister dst, XMMRegister shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, shift, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xF1, (0xC0 | encode));
}
void Assembler::pslld(XMMRegister dst, XMMRegister shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, shift, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xF2, (0xC0 | encode));
}
void Assembler::psllq(XMMRegister dst, XMMRegister shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, shift, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -9493,7 +9245,6 @@ void Assembler::vpsllw(XMMRegister dst, XMMRegister src, int shift, int vector_l
void Assembler::vpslld(XMMRegister dst, XMMRegister src, int shift, int vector_len) {
assert(UseAVX > 0, "requires some form of AVX");
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
// XMM6 is for /6 encoding: 66 0F 72 /6 ib
int encode = vex_prefix_and_encode(xmm6->encoding(), dst->encoding(), src->encoding(), VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -9533,7 +9284,6 @@ void Assembler::vpsllq(XMMRegister dst, XMMRegister src, XMMRegister shift, int
// Shift packed integers logically right by specified number of bits.
void Assembler::psrlw(XMMRegister dst, int shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
// XMM2 is for /2 encoding: 66 0F 71 /2 ib
int encode = simd_prefix_and_encode(xmm2, dst, dst, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -9541,7 +9291,6 @@ void Assembler::psrlw(XMMRegister dst, int shift) {
}
void Assembler::psrld(XMMRegister dst, int shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
// XMM2 is for /2 encoding: 66 0F 72 /2 ib
int encode = simd_prefix_and_encode(xmm2, dst, dst, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -9551,7 +9300,6 @@ void Assembler::psrld(XMMRegister dst, int shift) {
void Assembler::psrlq(XMMRegister dst, int shift) {
// Do not confuse it with psrldq SSE2 instruction which
// shifts 128 bit value in xmm register by number of bytes.
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
// XMM2 is for /2 encoding: 66 0F 73 /2 ib
@@ -9560,21 +9308,18 @@ void Assembler::psrlq(XMMRegister dst, int shift) {
}
void Assembler::psrlw(XMMRegister dst, XMMRegister shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, shift, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xD1, (0xC0 | encode));
}
void Assembler::psrld(XMMRegister dst, XMMRegister shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, shift, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xD2, (0xC0 | encode));
}
void Assembler::psrlq(XMMRegister dst, XMMRegister shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, shift, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -9646,7 +9391,6 @@ void Assembler::evpsllvw(XMMRegister dst, XMMRegister nds, XMMRegister src, int
// Shift packed integers arithmetically right by specified number of bits.
void Assembler::psraw(XMMRegister dst, int shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
// XMM4 is for /4 encoding: 66 0F 71 /4 ib
int encode = simd_prefix_and_encode(xmm4, dst, dst, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -9654,7 +9398,6 @@ void Assembler::psraw(XMMRegister dst, int shift) {
}
void Assembler::psrad(XMMRegister dst, int shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
// XMM4 is for /4 encoding: 66 0F 72 /4 ib
int encode = simd_prefix_and_encode(xmm4, dst, dst, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -9664,14 +9407,12 @@ void Assembler::psrad(XMMRegister dst, int shift) {
}
void Assembler::psraw(XMMRegister dst, XMMRegister shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, shift, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xE1, (0xC0 | encode));
}
void Assembler::psrad(XMMRegister dst, XMMRegister shift) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, shift, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xE2, (0xC0 | encode));
@@ -9727,7 +9468,6 @@ void Assembler::evpsraq(XMMRegister dst, XMMRegister src, XMMRegister shift, int
// logical operations packed integers
void Assembler::pand(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xDB, (0xC0 | encode));
@@ -9830,7 +9570,6 @@ void Assembler::vpshrdvd(XMMRegister dst, XMMRegister src, XMMRegister shift, in
}
void Assembler::pandn(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* vex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
@@ -9845,7 +9584,6 @@ void Assembler::vpandn(XMMRegister dst, XMMRegister nds, XMMRegister src, int ve
}
void Assembler::por(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xEB, (0xC0 | encode));
@@ -9906,7 +9644,6 @@ void Assembler::evpord(XMMRegister dst, KRegister mask, XMMRegister nds, Address
}
void Assembler::pxor(XMMRegister dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F, &attributes);
emit_int16((unsigned char)0xEF, (0xC0 | encode));
@@ -12540,7 +12277,6 @@ void Assembler::evpternlogq(XMMRegister dst, int imm8, KRegister mask, XMMRegist
void Assembler::gf2p8affineqb(XMMRegister dst, XMMRegister src, int imm8) {
assert(VM_Version::supports_gfni(), "");
- NOT_LP64(assert(VM_Version::supports_sse(), "");)
InstructionAttr attributes(AVX_128bit, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, src, VEX_SIMD_66, VEX_OPCODE_0F_3A, &attributes);
emit_int24((unsigned char)0xCE, (unsigned char)(0xC0 | encode), imm8);
@@ -12548,7 +12284,6 @@ void Assembler::gf2p8affineqb(XMMRegister dst, XMMRegister src, int imm8) {
void Assembler::vgf2p8affineqb(XMMRegister dst, XMMRegister src2, XMMRegister src3, int imm8, int vector_len) {
assert(VM_Version::supports_gfni(), "requires GFNI support");
- NOT_LP64(assert(VM_Version::supports_sse(), "");)
InstructionAttr attributes(vector_len, /* vex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
int encode = vex_prefix_and_encode(dst->encoding(), src2->encoding(), src3->encoding(), VEX_SIMD_66, VEX_OPCODE_0F_3A, &attributes);
emit_int24((unsigned char)0xCE, (unsigned char)(0xC0 | encode), imm8);
@@ -13040,428 +12775,6 @@ void Assembler::emit_farith(int b1, int b2, int i) {
emit_int16(b1, b2 + i);
}
-#ifndef _LP64
-// 32bit only pieces of the assembler
-
-void Assembler::emms() {
- NOT_LP64(assert(VM_Version::supports_mmx(), ""));
- emit_int16(0x0F, 0x77);
-}
-
-void Assembler::vzeroupper() {
- vzeroupper_uncached();
-}
-
-void Assembler::cmp_literal32(Register src1, int32_t imm32, RelocationHolder const& rspec) {
- // NO PREFIX AS NEVER 64BIT
- InstructionMark im(this);
- emit_int16((unsigned char)0x81, (0xF8 | src1->encoding()));
- emit_data(imm32, rspec, 0);
-}
-
-void Assembler::cmp_literal32(Address src1, int32_t imm32, RelocationHolder const& rspec) {
- // NO PREFIX AS NEVER 64BIT (not even 32bit versions of 64bit regs
- InstructionMark im(this);
- emit_int8((unsigned char)0x81);
- emit_operand(rdi, src1, 4);
- emit_data(imm32, rspec, 0);
-}
-
-// The 64-bit (32bit platform) cmpxchg compares the value at adr with the contents of rdx:rax,
-// and stores rcx:rbx into adr if so; otherwise, the value at adr is loaded
-// into rdx:rax. The ZF is set if the compared values were equal, and cleared otherwise.
-void Assembler::cmpxchg8(Address adr) {
- InstructionMark im(this);
- emit_int16(0x0F, (unsigned char)0xC7);
- emit_operand(rcx, adr, 0);
-}
-
-void Assembler::decl(Register dst) {
- // Don't use it directly. Use MacroAssembler::decrementl() instead.
- emit_int8(0x48 | dst->encoding());
-}
-
-void Assembler::edecl(Register dst, Register src, bool no_flags) {
- InstructionAttr attributes(AVX_128bit, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
- (void) evex_prefix_and_encode_ndd(0, dst->encoding(), src->encoding(), VEX_SIMD_NONE, VEX_OPCODE_0F_3C, &attributes, no_flags);
- emit_int8(0x48 | src->encoding());
-}
-
-// 64bit doesn't use the x87
-
-void Assembler::fabs() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xE1);
-}
-
-void Assembler::fadd(int i) {
- emit_farith(0xD8, 0xC0, i);
-}
-
-void Assembler::fadd_d(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDC);
- emit_operand32(rax, src, 0);
-}
-
-void Assembler::fadd_s(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xD8);
- emit_operand32(rax, src, 0);
-}
-
-void Assembler::fadda(int i) {
- emit_farith(0xDC, 0xC0, i);
-}
-
-void Assembler::faddp(int i) {
- emit_farith(0xDE, 0xC0, i);
-}
-
-void Assembler::fchs() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xE0);
-}
-
-void Assembler::fcom(int i) {
- emit_farith(0xD8, 0xD0, i);
-}
-
-void Assembler::fcomp(int i) {
- emit_farith(0xD8, 0xD8, i);
-}
-
-void Assembler::fcomp_d(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDC);
- emit_operand32(rbx, src, 0);
-}
-
-void Assembler::fcomp_s(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xD8);
- emit_operand32(rbx, src, 0);
-}
-
-void Assembler::fcompp() {
- emit_int16((unsigned char)0xDE, (unsigned char)0xD9);
-}
-
-void Assembler::fcos() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xFF);
-}
-
-void Assembler::fdecstp() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xF6);
-}
-
-void Assembler::fdiv(int i) {
- emit_farith(0xD8, 0xF0, i);
-}
-
-void Assembler::fdiv_d(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDC);
- emit_operand32(rsi, src, 0);
-}
-
-void Assembler::fdiv_s(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xD8);
- emit_operand32(rsi, src, 0);
-}
-
-void Assembler::fdiva(int i) {
- emit_farith(0xDC, 0xF8, i);
-}
-
-// Note: The Intel manual (Pentium Processor User's Manual, Vol.3, 1994)
-// is erroneous for some of the floating-point instructions below.
-
-void Assembler::fdivp(int i) {
- emit_farith(0xDE, 0xF8, i); // ST(0) <- ST(0) / ST(1) and pop (Intel manual wrong)
-}
-
-void Assembler::fdivr(int i) {
- emit_farith(0xD8, 0xF8, i);
-}
-
-void Assembler::fdivr_d(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDC);
- emit_operand32(rdi, src, 0);
-}
-
-void Assembler::fdivr_s(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xD8);
- emit_operand32(rdi, src, 0);
-}
-
-void Assembler::fdivra(int i) {
- emit_farith(0xDC, 0xF0, i);
-}
-
-void Assembler::fdivrp(int i) {
- emit_farith(0xDE, 0xF0, i); // ST(0) <- ST(1) / ST(0) and pop (Intel manual wrong)
-}
-
-void Assembler::ffree(int i) {
- emit_farith(0xDD, 0xC0, i);
-}
-
-void Assembler::fild_d(Address adr) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDF);
- emit_operand32(rbp, adr, 0);
-}
-
-void Assembler::fild_s(Address adr) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDB);
- emit_operand32(rax, adr, 0);
-}
-
-void Assembler::fincstp() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xF7);
-}
-
-void Assembler::finit() {
- emit_int24((unsigned char)0x9B, (unsigned char)0xDB, (unsigned char)0xE3);
-}
-
-void Assembler::fist_s(Address adr) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDB);
- emit_operand32(rdx, adr, 0);
-}
-
-void Assembler::fistp_d(Address adr) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDF);
- emit_operand32(rdi, adr, 0);
-}
-
-void Assembler::fistp_s(Address adr) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDB);
- emit_operand32(rbx, adr, 0);
-}
-
-void Assembler::fld1() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xE8);
-}
-
-void Assembler::fld_s(Address adr) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xD9);
- emit_operand32(rax, adr, 0);
-}
-
-
-void Assembler::fld_s(int index) {
- emit_farith(0xD9, 0xC0, index);
-}
-
-void Assembler::fldcw(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xD9);
- emit_operand32(rbp, src, 0);
-}
-
-void Assembler::fldenv(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xD9);
- emit_operand32(rsp, src, 0);
-}
-
-void Assembler::fldlg2() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xEC);
-}
-
-void Assembler::fldln2() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xED);
-}
-
-void Assembler::fldz() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xEE);
-}
-
-void Assembler::flog() {
- fldln2();
- fxch();
- fyl2x();
-}
-
-void Assembler::flog10() {
- fldlg2();
- fxch();
- fyl2x();
-}
-
-void Assembler::fmul(int i) {
- emit_farith(0xD8, 0xC8, i);
-}
-
-void Assembler::fmul_d(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDC);
- emit_operand32(rcx, src, 0);
-}
-
-void Assembler::fmul_s(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xD8);
- emit_operand32(rcx, src, 0);
-}
-
-void Assembler::fmula(int i) {
- emit_farith(0xDC, 0xC8, i);
-}
-
-void Assembler::fmulp(int i) {
- emit_farith(0xDE, 0xC8, i);
-}
-
-void Assembler::fnsave(Address dst) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDD);
- emit_operand32(rsi, dst, 0);
-}
-
-void Assembler::fnstcw(Address src) {
- InstructionMark im(this);
- emit_int16((unsigned char)0x9B, (unsigned char)0xD9);
- emit_operand32(rdi, src, 0);
-}
-
-void Assembler::fprem1() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xF5);
-}
-
-void Assembler::frstor(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDD);
- emit_operand32(rsp, src, 0);
-}
-
-void Assembler::fsin() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xFE);
-}
-
-void Assembler::fsqrt() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xFA);
-}
-
-void Assembler::fst_d(Address adr) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDD);
- emit_operand32(rdx, adr, 0);
-}
-
-void Assembler::fst_s(Address adr) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xD9);
- emit_operand32(rdx, adr, 0);
-}
-
-void Assembler::fstp_s(Address adr) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xD9);
- emit_operand32(rbx, adr, 0);
-}
-
-void Assembler::fsub(int i) {
- emit_farith(0xD8, 0xE0, i);
-}
-
-void Assembler::fsub_d(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDC);
- emit_operand32(rsp, src, 0);
-}
-
-void Assembler::fsub_s(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xD8);
- emit_operand32(rsp, src, 0);
-}
-
-void Assembler::fsuba(int i) {
- emit_farith(0xDC, 0xE8, i);
-}
-
-void Assembler::fsubp(int i) {
- emit_farith(0xDE, 0xE8, i); // ST(0) <- ST(0) - ST(1) and pop (Intel manual wrong)
-}
-
-void Assembler::fsubr(int i) {
- emit_farith(0xD8, 0xE8, i);
-}
-
-void Assembler::fsubr_d(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xDC);
- emit_operand32(rbp, src, 0);
-}
-
-void Assembler::fsubr_s(Address src) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xD8);
- emit_operand32(rbp, src, 0);
-}
-
-void Assembler::fsubra(int i) {
- emit_farith(0xDC, 0xE0, i);
-}
-
-void Assembler::fsubrp(int i) {
- emit_farith(0xDE, 0xE0, i); // ST(0) <- ST(1) - ST(0) and pop (Intel manual wrong)
-}
-
-void Assembler::ftan() {
- emit_int32((unsigned char)0xD9, (unsigned char)0xF2, (unsigned char)0xDD, (unsigned char)0xD8);
-}
-
-void Assembler::ftst() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xE4);
-}
-
-void Assembler::fucomi(int i) {
- // make sure the instruction is supported (introduced for P6, together with cmov)
- guarantee(VM_Version::supports_cmov(), "illegal instruction");
- emit_farith(0xDB, 0xE8, i);
-}
-
-void Assembler::fucomip(int i) {
- // make sure the instruction is supported (introduced for P6, together with cmov)
- guarantee(VM_Version::supports_cmov(), "illegal instruction");
- emit_farith(0xDF, 0xE8, i);
-}
-
-void Assembler::fwait() {
- emit_int8((unsigned char)0x9B);
-}
-
-void Assembler::fxch(int i) {
- emit_farith(0xD9, 0xC8, i);
-}
-
-void Assembler::fyl2x() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xF1);
-}
-
-void Assembler::frndint() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xFC);
-}
-
-void Assembler::f2xm1() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xF0);
-}
-
-void Assembler::fldl2e() {
- emit_int16((unsigned char)0xD9, (unsigned char)0xEA);
-}
-#endif // !_LP64
-
// SSE SIMD prefix byte values corresponding to VexSimdPrefix encoding.
static int simd_pre[4] = { 0, 0x66, 0xF3, 0xF2 };
// SSE opcode second byte values (first is 0x0F) corresponding to VexOpcode encoding.
@@ -13595,7 +12908,7 @@ void Assembler::vex_prefix(Address adr, int nds_enc, int xreg_enc, VexSimdPrefix
// is allowed in legacy mode and has resources which will fit in it.
// Pure EVEX instructions will have is_evex_instruction set in their definition.
if (!attributes->is_legacy_mode()) {
- if (UseAVX > 2 && !attributes->is_evex_instruction() && !is_managed()) {
+ if (UseAVX > 2 && !attributes->is_evex_instruction()) {
if ((attributes->get_vector_len() != AVX_512bit) && !is_extended) {
attributes->set_is_legacy_mode();
}
@@ -13610,7 +12923,6 @@ void Assembler::vex_prefix(Address adr, int nds_enc, int xreg_enc, VexSimdPrefix
assert((!is_extended || (!attributes->is_legacy_mode())),"XMM register should be 0-15");
}
- clear_managed();
if (UseAVX > 2 && !attributes->is_legacy_mode())
{
bool evex_r = (xreg_enc >= 16);
@@ -13658,7 +12970,7 @@ int Assembler::vex_prefix_and_encode(int dst_enc, int nds_enc, int src_enc, VexS
// is allowed in legacy mode and has resources which will fit in it.
// Pure EVEX instructions will have is_evex_instruction set in their definition.
if (!attributes->is_legacy_mode()) {
- if (UseAVX > 2 && !attributes->is_evex_instruction() && !is_managed()) {
+ if (UseAVX > 2 && !attributes->is_evex_instruction()) {
if ((!attributes->uses_vl() || (attributes->get_vector_len() != AVX_512bit)) &&
!is_extended) {
attributes->set_is_legacy_mode();
@@ -13680,7 +12992,6 @@ int Assembler::vex_prefix_and_encode(int dst_enc, int nds_enc, int src_enc, VexS
assert(((!is_extended) || (!attributes->is_legacy_mode())),"XMM register should be 0-15");
}
- clear_managed();
if (UseAVX > 2 && !attributes->is_legacy_mode())
{
bool evex_r = (dst_enc >= 16);
@@ -13810,6 +13121,18 @@ void Assembler::vcmpps(XMMRegister dst, XMMRegister nds, XMMRegister src, int co
emit_int24((unsigned char)0xC2, (0xC0 | encode), (unsigned char)comparison);
}
+void Assembler::evcmpph(KRegister kdst, KRegister mask, XMMRegister nds, XMMRegister src,
+ ComparisonPredicateFP comparison, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ attributes.set_embedded_opmask_register_specifier(mask);
+ attributes.reset_is_clear_context();
+ int encode = vex_prefix_and_encode(kdst->encoding(), nds->encoding(), src->encoding(), VEX_SIMD_NONE, VEX_OPCODE_0F_3A, &attributes);
+ emit_int24((unsigned char)0xC2, (0xC0 | encode), comparison);
+}
+
void Assembler::evcmpsh(KRegister kdst, KRegister mask, XMMRegister nds, XMMRegister src, ComparisonPredicateFP comparison) {
assert(VM_Version::supports_avx512_fp16(), "");
InstructionAttr attributes(Assembler::AVX_128bit, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true);
@@ -13822,7 +13145,7 @@ void Assembler::evcmpsh(KRegister kdst, KRegister mask, XMMRegister nds, XMMRegi
void Assembler::evcmpps(KRegister kdst, KRegister mask, XMMRegister nds, XMMRegister src,
ComparisonPredicateFP comparison, int vector_len) {
- assert(VM_Version::supports_evex(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
// Encoding: EVEX.NDS.XXX.0F.W0 C2 /r ib
InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true);
attributes.set_is_evex_instruction();
@@ -13834,7 +13157,7 @@ void Assembler::evcmpps(KRegister kdst, KRegister mask, XMMRegister nds, XMMRegi
void Assembler::evcmppd(KRegister kdst, KRegister mask, XMMRegister nds, XMMRegister src,
ComparisonPredicateFP comparison, int vector_len) {
- assert(VM_Version::supports_evex(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
// Encoding: EVEX.NDS.XXX.66.0F.W1 C2 /r ib
InstructionAttr attributes(vector_len, /* vex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true);
attributes.set_is_evex_instruction();
@@ -14155,7 +13478,7 @@ void Assembler::vpblendvb(XMMRegister dst, XMMRegister nds, XMMRegister src, XMM
}
void Assembler::evblendmpd(XMMRegister dst, KRegister mask, XMMRegister nds, XMMRegister src, bool merge, int vector_len) {
- assert(VM_Version::supports_evex(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
// Encoding: EVEX.NDS.XXX.66.0F38.W1 65 /r
InstructionAttr attributes(vector_len, /* vex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true);
attributes.set_is_evex_instruction();
@@ -14168,7 +13491,7 @@ void Assembler::evblendmpd(XMMRegister dst, KRegister mask, XMMRegister nds, XMM
}
void Assembler::evblendmps(XMMRegister dst, KRegister mask, XMMRegister nds, XMMRegister src, bool merge, int vector_len) {
- assert(VM_Version::supports_evex(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
// Encoding: EVEX.NDS.XXX.66.0F38.W0 65 /r
InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true);
attributes.set_is_evex_instruction();
@@ -14180,9 +13503,9 @@ void Assembler::evblendmps(XMMRegister dst, KRegister mask, XMMRegister nds, XMM
emit_int16(0x65, (0xC0 | encode));
}
-void Assembler::evpblendmb (XMMRegister dst, KRegister mask, XMMRegister nds, XMMRegister src, bool merge, int vector_len) {
- assert(VM_Version::supports_evex(), "");
+void Assembler::evpblendmb(XMMRegister dst, KRegister mask, XMMRegister nds, XMMRegister src, bool merge, int vector_len) {
assert(VM_Version::supports_avx512bw(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
// Encoding: EVEX.NDS.512.66.0F38.W0 66 /r
InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ false, /* uses_vl */ true);
attributes.set_is_evex_instruction();
@@ -14194,9 +13517,9 @@ void Assembler::evpblendmb (XMMRegister dst, KRegister mask, XMMRegister nds, XM
emit_int16(0x66, (0xC0 | encode));
}
-void Assembler::evpblendmw (XMMRegister dst, KRegister mask, XMMRegister nds, XMMRegister src, bool merge, int vector_len) {
- assert(VM_Version::supports_evex(), "");
+void Assembler::evpblendmw(XMMRegister dst, KRegister mask, XMMRegister nds, XMMRegister src, bool merge, int vector_len) {
assert(VM_Version::supports_avx512bw(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
// Encoding: EVEX.NDS.512.66.0F38.W1 66 /r
InstructionAttr attributes(vector_len, /* vex_w */ true, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ false, /* uses_vl */ true);
attributes.set_is_evex_instruction();
@@ -14208,8 +13531,8 @@ void Assembler::evpblendmw (XMMRegister dst, KRegister mask, XMMRegister nds, XM
emit_int16(0x66, (0xC0 | encode));
}
-void Assembler::evpblendmd (XMMRegister dst, KRegister mask, XMMRegister nds, XMMRegister src, bool merge, int vector_len) {
- assert(VM_Version::supports_evex(), "");
+void Assembler::evpblendmd(XMMRegister dst, KRegister mask, XMMRegister nds, XMMRegister src, bool merge, int vector_len) {
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
//Encoding: EVEX.NDS.512.66.0F38.W0 64 /r
InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true);
attributes.set_is_evex_instruction();
@@ -14221,8 +13544,8 @@ void Assembler::evpblendmd (XMMRegister dst, KRegister mask, XMMRegister nds, XM
emit_int16(0x64, (0xC0 | encode));
}
-void Assembler::evpblendmq (XMMRegister dst, KRegister mask, XMMRegister nds, XMMRegister src, bool merge, int vector_len) {
- assert(VM_Version::supports_evex(), "");
+void Assembler::evpblendmq(XMMRegister dst, KRegister mask, XMMRegister nds, XMMRegister src, bool merge, int vector_len) {
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
//Encoding: EVEX.NDS.512.66.0F38.W1 64 /r
InstructionAttr attributes(vector_len, /* vex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ false, /* uses_vl */ true);
attributes.set_is_evex_instruction();
@@ -14420,6 +13743,7 @@ void Assembler::shrxq(Register dst, Address src1, Register src2) {
void Assembler::evpmovq2m(KRegister dst, XMMRegister src, int vector_len) {
assert(VM_Version::supports_avx512vldq(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
InstructionAttr attributes(vector_len, /* vex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_is_evex_instruction();
int encode = vex_prefix_and_encode(dst->encoding(), 0, src->encoding(), VEX_SIMD_F3, VEX_OPCODE_0F_38, &attributes);
@@ -14428,6 +13752,7 @@ void Assembler::evpmovq2m(KRegister dst, XMMRegister src, int vector_len) {
void Assembler::evpmovd2m(KRegister dst, XMMRegister src, int vector_len) {
assert(VM_Version::supports_avx512vldq(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_is_evex_instruction();
int encode = vex_prefix_and_encode(dst->encoding(), 0, src->encoding(), VEX_SIMD_F3, VEX_OPCODE_0F_38, &attributes);
@@ -14436,6 +13761,7 @@ void Assembler::evpmovd2m(KRegister dst, XMMRegister src, int vector_len) {
void Assembler::evpmovw2m(KRegister dst, XMMRegister src, int vector_len) {
assert(VM_Version::supports_avx512vlbw(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
InstructionAttr attributes(vector_len, /* vex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_is_evex_instruction();
int encode = vex_prefix_and_encode(dst->encoding(), 0, src->encoding(), VEX_SIMD_F3, VEX_OPCODE_0F_38, &attributes);
@@ -14444,6 +13770,7 @@ void Assembler::evpmovw2m(KRegister dst, XMMRegister src, int vector_len) {
void Assembler::evpmovb2m(KRegister dst, XMMRegister src, int vector_len) {
assert(VM_Version::supports_avx512vlbw(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_is_evex_instruction();
int encode = vex_prefix_and_encode(dst->encoding(), 0, src->encoding(), VEX_SIMD_F3, VEX_OPCODE_0F_38, &attributes);
@@ -14452,6 +13779,7 @@ void Assembler::evpmovb2m(KRegister dst, XMMRegister src, int vector_len) {
void Assembler::evpmovm2q(XMMRegister dst, KRegister src, int vector_len) {
assert(VM_Version::supports_avx512vldq(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
InstructionAttr attributes(vector_len, /* vex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_is_evex_instruction();
int encode = vex_prefix_and_encode(dst->encoding(), 0, src->encoding(), VEX_SIMD_F3, VEX_OPCODE_0F_38, &attributes);
@@ -14460,6 +13788,7 @@ void Assembler::evpmovm2q(XMMRegister dst, KRegister src, int vector_len) {
void Assembler::evpmovm2d(XMMRegister dst, KRegister src, int vector_len) {
assert(VM_Version::supports_avx512vldq(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_is_evex_instruction();
int encode = vex_prefix_and_encode(dst->encoding(), 0, src->encoding(), VEX_SIMD_F3, VEX_OPCODE_0F_38, &attributes);
@@ -14468,6 +13797,7 @@ void Assembler::evpmovm2d(XMMRegister dst, KRegister src, int vector_len) {
void Assembler::evpmovm2w(XMMRegister dst, KRegister src, int vector_len) {
assert(VM_Version::supports_avx512vlbw(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
InstructionAttr attributes(vector_len, /* vex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_is_evex_instruction();
int encode = vex_prefix_and_encode(dst->encoding(), 0, src->encoding(), VEX_SIMD_F3, VEX_OPCODE_0F_38, &attributes);
@@ -14476,6 +13806,7 @@ void Assembler::evpmovm2w(XMMRegister dst, KRegister src, int vector_len) {
void Assembler::evpmovm2b(XMMRegister dst, KRegister src, int vector_len) {
assert(VM_Version::supports_avx512vlbw(), "");
+ assert(VM_Version::supports_avx512vl() || vector_len == Assembler::AVX_512bit, "");
InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_is_evex_instruction();
int encode = vex_prefix_and_encode(dst->encoding(), 0, src->encoding(), VEX_SIMD_F3, VEX_OPCODE_0F_38, &attributes);
@@ -14560,55 +13891,6 @@ void Assembler::evcompresspd(XMMRegister dst, KRegister mask, XMMRegister src, b
emit_int16((unsigned char)0x8A, (0xC0 | encode));
}
-#ifndef _LP64
-
-void Assembler::incl(Register dst) {
- // Don't use it directly. Use MacroAssembler::incrementl() instead.
- emit_int8(0x40 | dst->encoding());
-}
-
-void Assembler::eincl(Register dst, Register src, bool no_flags) {
- InstructionAttr attributes(AVX_128bit, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
- (void) evex_prefix_and_encode_ndd(0, dst->encoding(), src->encoding(), VEX_SIMD_NONE, VEX_OPCODE_0F_3C, &attributes, no_flags);
- emit_int8(0x40 | src->encoding());
-}
-
-void Assembler::lea(Register dst, Address src) {
- leal(dst, src);
-}
-
-void Assembler::mov_literal32(Address dst, int32_t imm32, RelocationHolder const& rspec) {
- InstructionMark im(this);
- emit_int8((unsigned char)0xC7);
- emit_operand(rax, dst, 4);
- emit_data((int)imm32, rspec, 0);
-}
-
-void Assembler::mov_literal32(Register dst, int32_t imm32, RelocationHolder const& rspec) {
- InstructionMark im(this);
- int encode = prefix_and_encode(dst->encoding());
- emit_int8((0xB8 | encode));
- emit_data((int)imm32, rspec, 0);
-}
-
-void Assembler::popa() { // 32bit
- emit_int8(0x61);
-}
-
-void Assembler::push_literal32(int32_t imm32, RelocationHolder const& rspec) {
- InstructionMark im(this);
- emit_int8(0x68);
- emit_data(imm32, rspec, 0);
-}
-
-void Assembler::pusha() { // 32bit
- emit_int8(0x60);
-}
-
-#else // LP64
-
-// 64bit only pieces of the assembler
-
// This should only be used by 64bit instructions that can use rip-relative
// it cannot be used by instructions that want an immediate value.
@@ -15562,14 +14844,12 @@ void Assembler::cmpxchgq(Register reg, Address adr) {
}
void Assembler::cvtsi2sdq(XMMRegister dst, Register src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, dst, as_XMMRegister(src->encoding()), VEX_SIMD_F2, VEX_OPCODE_0F, &attributes, true);
emit_int16(0x2A, (0xC0 | encode));
}
void Assembler::cvtsi2sdq(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -15579,7 +14859,6 @@ void Assembler::cvtsi2sdq(XMMRegister dst, Address src) {
}
void Assembler::cvtsi2ssq(XMMRegister dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionMark im(this);
InstructionAttr attributes(AVX_128bit, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
attributes.set_address_attributes(/* tuple_type */ EVEX_T1S, /* input_size_in_bits */ EVEX_64bit);
@@ -15589,7 +14868,6 @@ void Assembler::cvtsi2ssq(XMMRegister dst, Address src) {
}
void Assembler::cvttsd2siq(Register dst, Address src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
// F2 REX.W 0F 2C /r
// CVTTSD2SI r64, xmm1/m64
InstructionMark im(this);
@@ -15600,21 +14878,18 @@ void Assembler::cvttsd2siq(Register dst, Address src) {
}
void Assembler::cvttsd2siq(Register dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(as_XMMRegister(dst->encoding()), xnoreg, src, VEX_SIMD_F2, VEX_OPCODE_0F, &attributes);
emit_int16(0x2C, (0xC0 | encode));
}
void Assembler::cvtsd2siq(Register dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(as_XMMRegister(dst->encoding()), xnoreg, src, VEX_SIMD_F2, VEX_OPCODE_0F, &attributes);
emit_int16(0x2D, (0xC0 | encode));
}
void Assembler::cvttss2siq(Register dst, XMMRegister src) {
- NOT_LP64(assert(VM_Version::supports_sse(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(as_XMMRegister(dst->encoding()), xnoreg, src, VEX_SIMD_F3, VEX_OPCODE_0F, &attributes);
emit_int16(0x2C, (0xC0 | encode));
@@ -15960,16 +15235,12 @@ void Assembler::elzcntq(Register dst, Address src, bool no_flags) {
}
void Assembler::movdq(XMMRegister dst, Register src) {
- // table D-1 says MMX/SSE2
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
int encode = simd_prefix_and_encode(dst, xnoreg, as_XMMRegister(src->encoding()), VEX_SIMD_66, VEX_OPCODE_0F, &attributes, true);
emit_int16(0x6E, (0xC0 | encode));
}
void Assembler::movdq(Register dst, XMMRegister src) {
- // table D-1 says MMX/SSE2
- NOT_LP64(assert(VM_Version::supports_sse2(), ""));
InstructionAttr attributes(AVX_128bit, /* rex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ false);
// swap src/dst to get correct prefix
int encode = simd_prefix_and_encode(src, xnoreg, as_XMMRegister(dst->encoding()), VEX_SIMD_66, VEX_OPCODE_0F, &attributes, true);
@@ -16572,7 +15843,6 @@ void Assembler::rorxq(Register dst, Address src, int imm8) {
emit_int8(imm8);
}
-#ifdef _LP64
void Assembler::salq(Address dst, int imm8) {
InstructionMark im(this);
assert(isShiftCount(imm8 >> 1), "illegal shift count");
@@ -16727,7 +15997,6 @@ void Assembler::esarq(Register dst, Register src, bool no_flags) {
int encode = evex_prefix_and_encode_ndd(0, dst->encoding(), src->encoding(), VEX_SIMD_NONE, VEX_OPCODE_0F_3C, &attributes, no_flags);
emit_int16((unsigned char)0xD3, (0xF8 | encode));
}
-#endif
void Assembler::sbbq(Address dst, int32_t imm32) {
InstructionMark im(this);
@@ -17072,8 +16341,6 @@ void Assembler::exorq(Register dst, Address src1, Register src2, bool no_flags)
emit_operand(src2, src1, 0);
}
-#endif // !LP64
-
void InstructionAttr::set_address_attributes(int tuple_type, int input_size_in_bits) {
if (VM_Version::supports_evex()) {
_tuple_type = tuple_type;
@@ -17161,3 +16428,171 @@ void Assembler::evpermt2q(XMMRegister dst, XMMRegister nds, XMMRegister src, int
emit_int16(0x7E, (0xC0 | encode));
}
+void Assembler::evaddph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "requires AVX512-FP16");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ int encode = vex_prefix_and_encode(dst->encoding(), nds->encoding(), src->encoding(), VEX_SIMD_NONE, VEX_OPCODE_MAP5, &attributes);
+ emit_int16(0x58, (0xC0 | encode));
+}
+
+void Assembler::evaddph(XMMRegister dst, XMMRegister nds, Address src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "requires AVX512-FP16");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionMark im(this);
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ attributes.set_address_attributes(/* tuple_type */ EVEX_FVM, /* input_size_in_bits */ EVEX_NObit);
+ vex_prefix(src, nds->encoding(), dst->encoding(), VEX_SIMD_NONE, VEX_OPCODE_MAP5, &attributes);
+ emit_int8(0x58);
+ emit_operand(dst, src, 0);
+}
+
+void Assembler::evsubph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "requires AVX512-FP16");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ int encode = vex_prefix_and_encode(dst->encoding(), nds->encoding(), src->encoding(), VEX_SIMD_NONE, VEX_OPCODE_MAP5, &attributes);
+ emit_int16(0x5C, (0xC0 | encode));
+}
+
+void Assembler::evsubph(XMMRegister dst, XMMRegister nds, Address src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "requires AVX512-FP16");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionMark im(this);
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ attributes.set_address_attributes(/* tuple_type */ EVEX_FVM, /* input_size_in_bits */ EVEX_NObit);
+ vex_prefix(src, nds->encoding(), dst->encoding(), VEX_SIMD_NONE, VEX_OPCODE_MAP5, &attributes);
+ emit_int8(0x5C);
+ emit_operand(dst, src, 0);
+}
+
+void Assembler::evmulph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "requires AVX512-FP16");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ int encode = vex_prefix_and_encode(dst->encoding(), nds->encoding(), src->encoding(), VEX_SIMD_NONE, VEX_OPCODE_MAP5, &attributes);
+ emit_int16(0x59, (0xC0 | encode));
+}
+
+void Assembler::evmulph(XMMRegister dst, XMMRegister nds, Address src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "requires AVX512-FP16");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionMark im(this);
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ attributes.set_address_attributes(/* tuple_type */ EVEX_FVM, /* input_size_in_bits */ EVEX_NObit);
+ vex_prefix(src, nds->encoding(), dst->encoding(), VEX_SIMD_NONE, VEX_OPCODE_MAP5, &attributes);
+ emit_int8(0x59);
+ emit_operand(dst, src, 0);
+}
+
+void Assembler::evminph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "requires AVX512-FP16");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ int encode = vex_prefix_and_encode(dst->encoding(), nds->encoding(), src->encoding(), VEX_SIMD_NONE, VEX_OPCODE_MAP5, &attributes);
+ emit_int16(0x5D, (0xC0 | encode));
+}
+
+void Assembler::evminph(XMMRegister dst, XMMRegister nds, Address src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "requires AVX512-FP16");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionMark im(this);
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ attributes.set_address_attributes(/* tuple_type */ EVEX_FVM, /* input_size_in_bits */ EVEX_NObit);
+ vex_prefix(src, nds->encoding(), dst->encoding(), VEX_SIMD_NONE, VEX_OPCODE_MAP5, &attributes);
+ emit_int8(0x5D);
+ emit_operand(dst, src, 0);
+}
+
+void Assembler::evmaxph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "requires AVX512-FP16");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ int encode = vex_prefix_and_encode(dst->encoding(), nds->encoding(), src->encoding(), VEX_SIMD_NONE, VEX_OPCODE_MAP5, &attributes);
+ emit_int16(0x5F, (0xC0 | encode));
+}
+
+void Assembler::evmaxph(XMMRegister dst, XMMRegister nds, Address src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "requires AVX512-FP16");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionMark im(this);
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ attributes.set_address_attributes(/* tuple_type */ EVEX_FVM, /* input_size_in_bits */ EVEX_NObit);
+ vex_prefix(src, nds->encoding(), dst->encoding(), VEX_SIMD_NONE, VEX_OPCODE_MAP5, &attributes);
+ emit_int8(0x5F);
+ emit_operand(dst, src, 0);
+}
+
+void Assembler::evdivph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "requires AVX512-FP16");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ int encode = vex_prefix_and_encode(dst->encoding(), nds->encoding(), src->encoding(), VEX_SIMD_NONE, VEX_OPCODE_MAP5, &attributes);
+ emit_int16(0x5E, (0xC0 | encode));
+}
+
+void Assembler::evdivph(XMMRegister dst, XMMRegister nds, Address src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "requires AVX512-FP16");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionMark im(this);
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ attributes.set_address_attributes(/* tuple_type */ EVEX_FVM, /* input_size_in_bits */ EVEX_NObit);
+ vex_prefix(src, nds->encoding(), dst->encoding(), VEX_SIMD_NONE, VEX_OPCODE_MAP5, &attributes);
+ emit_int8(0x5E);
+ emit_operand(dst, src, 0);
+}
+
+void Assembler::evsqrtph(XMMRegister dst, XMMRegister src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ int encode = vex_prefix_and_encode(dst->encoding(), 0, src->encoding(), VEX_SIMD_NONE, VEX_OPCODE_MAP5, &attributes);
+ emit_int16(0x51, (0xC0 | encode));
+}
+
+void Assembler::evsqrtph(XMMRegister dst, Address src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionMark im(this);
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ attributes.set_address_attributes(/* tuple_type */ EVEX_FV, /* input_size_in_bits */ EVEX_NObit);
+ vex_prefix(src, 0, dst->encoding(), VEX_SIMD_NONE, VEX_OPCODE_MAP5, &attributes);
+ emit_int8(0x51);
+ emit_operand(dst, src, 0);
+}
+
+void Assembler::evfmadd132ph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ int encode = vex_prefix_and_encode(dst->encoding(), nds->encoding(), src->encoding(), VEX_SIMD_66, VEX_OPCODE_MAP6, &attributes);
+ emit_int16(0x98, (0xC0 | encode));
+}
+
+void Assembler::evfmadd132ph(XMMRegister dst, XMMRegister nds, Address src, int vector_len) {
+ assert(VM_Version::supports_avx512_fp16(), "");
+ assert(vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl(), "");
+ InstructionMark im(this);
+ InstructionAttr attributes(vector_len, /* vex_w */ false, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
+ attributes.set_is_evex_instruction();
+ attributes.set_address_attributes(/* tuple_type */ EVEX_FV, /* input_size_in_bits */ EVEX_NObit);
+ vex_prefix(src, nds->encoding(), dst->encoding(), VEX_SIMD_66, VEX_OPCODE_MAP6, &attributes);
+ emit_int8(0x98);
+ emit_operand(dst, src, 0);
+}
+
diff --git a/src/hotspot/cpu/x86/assembler_x86.hpp b/src/hotspot/cpu/x86/assembler_x86.hpp
index 15ea45edb9159..719334701a5a6 100644
--- a/src/hotspot/cpu/x86/assembler_x86.hpp
+++ b/src/hotspot/cpu/x86/assembler_x86.hpp
@@ -35,7 +35,6 @@
class Argument {
public:
enum {
-#ifdef _LP64
#ifdef _WIN64
n_int_register_parameters_c = 4, // rcx, rdx, r8, r9 (c_rarg0, c_rarg1, ...)
n_float_register_parameters_c = 4, // xmm0 - xmm3 (c_farg0, c_farg1, ... )
@@ -49,16 +48,10 @@ class Argument {
#endif // _WIN64
n_int_register_parameters_j = 6, // j_rarg0, j_rarg1, ...
n_float_register_parameters_j = 8 // j_farg0, j_farg1, ...
-#else
- n_register_parameters = 0, // 0 registers used to pass arguments
- n_int_register_parameters_j = 0,
- n_float_register_parameters_j = 0
-#endif // _LP64
};
};
-#ifdef _LP64
// Symbolically name the register arguments used by the c calling convention.
// Windows is different from linux/solaris. So much for standards...
@@ -138,15 +131,6 @@ constexpr Register rscratch2 = r11; // volatile
constexpr Register r12_heapbase = r12; // callee-saved
constexpr Register r15_thread = r15; // callee-saved
-#else
-// rscratch1 will appear in 32bit code that is dead but of course must compile
-// Using noreg ensures if the dead code is incorrectly live and executed it
-// will cause an assertion failure
-#define rscratch1 noreg
-#define rscratch2 noreg
-
-#endif // _LP64
-
// JSR 292
// On x86, the SP does not have to be saved when invoking method handle intrinsics
// or compiled lambda forms. We indicate that by setting rbp_mh_SP_save to noreg.
@@ -168,7 +152,7 @@ class Address {
times_2 = 1,
times_4 = 2,
times_8 = 3,
- times_ptr = LP64_ONLY(times_8) NOT_LP64(times_4)
+ times_ptr = times_8
};
static ScaleFactor times(int size) {
assert(size >= 1 && size <= 8 && is_power_of_2(size), "bad scale size");
@@ -197,7 +181,6 @@ class Address {
// Easily misused constructors make them private
// %%% can we make these go away?
- NOT_LP64(Address(address loc, RelocationHolder spec);)
Address(int disp, address loc, relocInfo::relocType rtype);
Address(int disp, address loc, RelocationHolder spec);
@@ -456,7 +439,7 @@ class InstructionAttr;
// 64-bit reflect the fxsave size which is 512 bytes and the new xsave area on EVEX which is another 2176 bytes
// See fxsave and xsave(EVEX enabled) documentation for layout
-const int FPUStateSizeInWords = NOT_LP64(27) LP64_ONLY(2688 / wordSize);
+const int FPUStateSizeInWords = 2688 / wordSize;
// The Intel x86/Amd64 Assembler: Pure assembler doing NO optimizations on the instruction
// level (e.g. mov rax, 0 is not translated into xor rax, rax!); i.e., what you write
@@ -628,12 +611,8 @@ class Assembler : public AbstractAssembler {
imm_operand = 0, // embedded 32-bit|64-bit immediate operand
disp32_operand = 1, // embedded 32-bit displacement or address
call32_operand = 2, // embedded 32-bit self-relative displacement
-#ifndef _LP64
- _WhichOperand_limit = 3
-#else
- narrow_oop_operand = 3, // embedded 32-bit immediate narrow oop
+ narrow_oop_operand = 3, // embedded 32-bit immediate narrow oop
_WhichOperand_limit = 4
-#endif
};
// Comparison predicates for integral types & FP types when using SSE
@@ -721,7 +700,6 @@ class Assembler : public AbstractAssembler {
bool _legacy_mode_dq;
bool _legacy_mode_vl;
bool _legacy_mode_vlbw;
- NOT_LP64(bool _is_managed;)
InstructionAttr *_attributes;
void set_attributes(InstructionAttr* attributes);
@@ -907,25 +885,13 @@ class Assembler : public AbstractAssembler {
void emit_opcode_prefix_and_encoding(int byte1, int ocp_and_encoding);
void emit_opcode_prefix_and_encoding(int byte1, int byte2, int ocp_and_encoding);
void emit_opcode_prefix_and_encoding(int byte1, int byte2, int ocp_and_encoding, int byte3);
- bool always_reachable(AddressLiteral adr) NOT_LP64( { return true; } );
- bool reachable(AddressLiteral adr) NOT_LP64( { return true; } );
+ bool always_reachable(AddressLiteral adr);
+ bool reachable(AddressLiteral adr);
// These are all easily abused and hence protected
public:
- // 32BIT ONLY SECTION
-#ifndef _LP64
- // Make these disappear in 64bit mode since they would never be correct
- void cmp_literal32(Register src1, int32_t imm32, RelocationHolder const& rspec); // 32BIT ONLY
- void cmp_literal32(Address src1, int32_t imm32, RelocationHolder const& rspec); // 32BIT ONLY
-
- void mov_literal32(Register dst, int32_t imm32, RelocationHolder const& rspec); // 32BIT ONLY
- void mov_literal32(Address dst, int32_t imm32, RelocationHolder const& rspec); // 32BIT ONLY
-
- void push_literal32(int32_t imm32, RelocationHolder const& rspec); // 32BIT ONLY
-#else
- // 64BIT ONLY SECTION
void mov_literal64(Register dst, intptr_t imm64, RelocationHolder const& rspec); // 64BIT ONLY
void cmp_narrow_oop(Register src1, int32_t imm32, RelocationHolder const& rspec);
@@ -933,7 +899,6 @@ class Assembler : public AbstractAssembler {
void mov_narrow_oop(Register dst, int32_t imm32, RelocationHolder const& rspec);
void mov_narrow_oop(Address dst, int32_t imm32, RelocationHolder const& rspec);
-#endif // _LP64
protected:
// These are unique in that we are ensured by the caller that the 32bit
@@ -1017,17 +982,10 @@ class Assembler : public AbstractAssembler {
void init_attributes(void);
void clear_attributes(void) { _attributes = nullptr; }
- void set_managed(void) { NOT_LP64(_is_managed = true;) }
- void clear_managed(void) { NOT_LP64(_is_managed = false;) }
- bool is_managed(void) {
- NOT_LP64(return _is_managed;)
- LP64_ONLY(return false;) }
-
void lea(Register dst, Address src);
void mov(Register dst, Register src);
-#ifdef _LP64
// support caching the result of some routines
// must be called before pusha(), popa(), vzeroupper() - checked with asserts
@@ -1047,7 +1005,6 @@ class Assembler : public AbstractAssembler {
// New Zero Upper setcc instruction.
void esetzucc(Condition cc, Register dst);
-#endif
void vzeroupper_uncached();
void decq(Register dst);
void edecq(Register dst, Register src, bool no_flags);
@@ -1069,9 +1026,7 @@ class Assembler : public AbstractAssembler {
void rep_stos();
void rep_stosb();
void repne_scan();
-#ifdef _LP64
void repne_scanl();
-#endif
// Vanilla instructions in lexical order
@@ -1121,7 +1076,6 @@ class Assembler : public AbstractAssembler {
void eincq(Register dst, Register src, bool no_flags);
void eincq(Register dst, Address src, bool no_flags);
-#ifdef _LP64
//Add Unsigned Integers with Carry Flag
void adcxq(Register dst, Register src);
void eadcxq(Register dst, Register src1, Register src2);
@@ -1129,7 +1083,6 @@ class Assembler : public AbstractAssembler {
//Add Unsigned Integers with Overflow Flag
void adoxq(Register dst, Register src);
void eadoxq(Register dst, Register src1, Register src2);
-#endif
void addr_nop_4();
void addr_nop_5();
@@ -1206,10 +1159,8 @@ class Assembler : public AbstractAssembler {
void bsfl(Register dst, Register src);
void bsrl(Register dst, Register src);
-#ifdef _LP64
void bsfq(Register dst, Register src);
void bsrq(Register dst, Register src);
-#endif
void bswapl(Register reg);
@@ -1281,6 +1232,9 @@ class Assembler : public AbstractAssembler {
// Identify processor type and features
void cpuid();
+ // Serialize instruction stream
+ void serialize();
+
// CRC32C
void crc32(Register crc, Register v, int8_t sizeInBytes);
void crc32(Register crc, Address adr, int8_t sizeInBytes);
@@ -1395,139 +1349,6 @@ class Assembler : public AbstractAssembler {
void emit_farith(int b1, int b2, int i);
public:
-#ifndef _LP64
- void emms();
-
- void fabs();
-
- void fadd(int i);
-
- void fadd_d(Address src);
- void fadd_s(Address src);
-
- // "Alternate" versions of x87 instructions place result down in FPU
- // stack instead of on TOS
-
- void fadda(int i); // "alternate" fadd
- void faddp(int i = 1);
-
- void fchs();
-
- void fcom(int i);
-
- void fcomp(int i = 1);
- void fcomp_d(Address src);
- void fcomp_s(Address src);
-
- void fcompp();
-
- void fcos();
-
- void fdecstp();
-
- void fdiv(int i);
- void fdiv_d(Address src);
- void fdivr_s(Address src);
- void fdiva(int i); // "alternate" fdiv
- void fdivp(int i = 1);
-
- void fdivr(int i);
- void fdivr_d(Address src);
- void fdiv_s(Address src);
-
- void fdivra(int i); // "alternate" reversed fdiv
-
- void fdivrp(int i = 1);
-
- void ffree(int i = 0);
-
- void fild_d(Address adr);
- void fild_s(Address adr);
-
- void fincstp();
-
- void finit();
-
- void fist_s (Address adr);
- void fistp_d(Address adr);
- void fistp_s(Address adr);
-
- void fld1();
-
- void fld_s(Address adr);
- void fld_s(int index);
-
- void fldcw(Address src);
-
- void fldenv(Address src);
-
- void fldlg2();
-
- void fldln2();
-
- void fldz();
-
- void flog();
- void flog10();
-
- void fmul(int i);
-
- void fmul_d(Address src);
- void fmul_s(Address src);
-
- void fmula(int i); // "alternate" fmul
-
- void fmulp(int i = 1);
-
- void fnsave(Address dst);
-
- void fnstcw(Address src);
- void fprem1();
-
- void frstor(Address src);
-
- void fsin();
-
- void fsqrt();
-
- void fst_d(Address adr);
- void fst_s(Address adr);
-
- void fstp_s(Address adr);
-
- void fsub(int i);
- void fsub_d(Address src);
- void fsub_s(Address src);
-
- void fsuba(int i); // "alternate" fsub
-
- void fsubp(int i = 1);
-
- void fsubr(int i);
- void fsubr_d(Address src);
- void fsubr_s(Address src);
-
- void fsubra(int i); // "alternate" reversed fsub
-
- void fsubrp(int i = 1);
-
- void ftan();
-
- void ftst();
-
- void fucomi(int i = 1);
- void fucomip(int i = 1);
-
- void fwait();
-
- void fxch(int i = 1);
-
- void fyl2x();
- void frndint();
- void f2xm1();
- void fldl2e();
-#endif // !_LP64
-
// operands that only take the original 32bit registers
void emit_operand32(Register reg, Address adr, int post_addr_length);
@@ -1546,12 +1367,10 @@ class Assembler : public AbstractAssembler {
void divl(Register src); // Unsigned division
void edivl(Register src, bool no_flags); // Unsigned division
-#ifdef _LP64
void idivq(Register src);
void eidivq(Register src, bool no_flags);
void divq(Register src); // Unsigned division
void edivq(Register src, bool no_flags); // Unsigned division
-#endif
void imull(Register src);
void eimull(Register src, bool no_flags);
@@ -1564,7 +1383,6 @@ class Assembler : public AbstractAssembler {
void imull(Register dst, Address src);
void eimull(Register dst, Register src1, Address src2, bool no_flags);
-#ifdef _LP64
void imulq(Register dst, Register src);
void eimulq(Register dst, Register src, bool no_flags);
void eimulq(Register dst, Register src1, Register src2, bool no_flags);
@@ -1577,7 +1395,6 @@ class Assembler : public AbstractAssembler {
void eimulq(Register dst, Register src1, Address src2, bool no_flags);
void imulq(Register dst);
void eimulq(Register dst, bool no_flags);
-#endif
// jcc is the generic conditional branch generator to run-
// time routines, jcc is used for branches to labels. jcc
@@ -1629,9 +1446,7 @@ class Assembler : public AbstractAssembler {
void leaq(Register dst, Address src);
-#ifdef _LP64
void lea(Register dst, Label& L);
-#endif
void lfence();
@@ -1643,12 +1458,10 @@ class Assembler : public AbstractAssembler {
void lzcntl(Register dst, Address src);
void elzcntl(Register dst, Address src, bool no_flags);
-#ifdef _LP64
void lzcntq(Register dst, Register src);
void elzcntq(Register dst, Register src, bool no_flags);
void lzcntq(Register dst, Address src);
void elzcntq(Register dst, Address src, bool no_flags);
-#endif
enum Membar_mask_bits {
StoreStore = 1 << 3,
@@ -1808,13 +1621,11 @@ class Assembler : public AbstractAssembler {
void movl(Register dst, Address src);
void movl(Address dst, Register src);
-#ifdef _LP64
void movq(Register dst, Register src);
void movq(Register dst, Address src);
void movq(Address dst, Register src);
void movq(Address dst, int32_t imm32);
void movq(Register dst, int32_t imm32);
-#endif
// Move Quadword
void movq(Address dst, XMMRegister src);
@@ -1829,7 +1640,6 @@ class Assembler : public AbstractAssembler {
void vmovw(XMMRegister dst, Register src);
void vmovw(Register dst, XMMRegister src);
-#ifdef _LP64
void movsbq(Register dst, Address src);
void movsbq(Register dst, Register src);
@@ -1838,15 +1648,12 @@ class Assembler : public AbstractAssembler {
void movslq(Register dst, Address src);
void movslq(Register dst, Register src);
-#endif
void movswl(Register dst, Address src);
void movswl(Register dst, Register src);
-#ifdef _LP64
void movswq(Register dst, Address src);
void movswq(Register dst, Register src);
-#endif
void movups(XMMRegister dst, Address src);
void vmovups(XMMRegister dst, Address src, int vector_len);
@@ -1860,18 +1667,14 @@ class Assembler : public AbstractAssembler {
void movzbl(Register dst, Address src);
void movzbl(Register dst, Register src);
-#ifdef _LP64
void movzbq(Register dst, Address src);
void movzbq(Register dst, Register src);
-#endif
void movzwl(Register dst, Address src);
void movzwl(Register dst, Register src);
-#ifdef _LP64
void movzwq(Register dst, Address src);
void movzwq(Register dst, Register src);
-#endif
// Unsigned multiply with RAX destination register
void mull(Address src);
@@ -1879,13 +1682,11 @@ class Assembler : public AbstractAssembler {
void mull(Register src);
void emull(Register src, bool no_flags);
-#ifdef _LP64
void mulq(Address src);
void emulq(Address src, bool no_flags);
void mulq(Register src);
void emulq(Register src, bool no_flags);
void mulxq(Register dst1, Register dst2, Register src);
-#endif
// Multiply Scalar Double-Precision Floating-Point Values
void mulsd(XMMRegister dst, Address src);
@@ -1900,26 +1701,22 @@ class Assembler : public AbstractAssembler {
void negl(Address dst);
void enegl(Register dst, Address src, bool no_flags);
-#ifdef _LP64
void negq(Register dst);
void enegq(Register dst, Register src, bool no_flags);
void negq(Address dst);
void enegq(Register dst, Address src, bool no_flags);
-#endif
void nop(uint i = 1);
void notl(Register dst);
void enotl(Register dst, Register src);
-#ifdef _LP64
void notq(Register dst);
void enotq(Register dst, Register src);
void btsq(Address dst, int imm8);
void btrq(Address dst, int imm8);
void btq(Register src, int imm8);
-#endif
void btq(Register dst, Register src);
void eorw(Register dst, Register src1, Register src2, bool no_flags);
@@ -2138,14 +1935,8 @@ class Assembler : public AbstractAssembler {
// Multiply add accumulate
void evpdpwssd(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len);
-#ifndef _LP64 // no 32bit push/pop on amd64
- void popl(Address dst);
-#endif
-
-#ifdef _LP64
void popq(Address dst);
void popq(Register dst);
-#endif
void popcntl(Register dst, Address src);
void epopcntl(Register dst, Address src, bool no_flags);
@@ -2157,12 +1948,10 @@ class Assembler : public AbstractAssembler {
void evpopcntd(XMMRegister dst, KRegister mask, XMMRegister src, bool merge, int vector_len);
void evpopcntq(XMMRegister dst, KRegister mask, XMMRegister src, bool merge, int vector_len);
-#ifdef _LP64
void popcntq(Register dst, Address src);
void epopcntq(Register dst, Address src, bool no_flags);
void popcntq(Register dst, Register src);
void epopcntq(Register dst, Register src, bool no_flags);
-#endif
// Prefetches (SSE, SSE2, 3DNOW only)
@@ -2254,10 +2043,6 @@ class Assembler : public AbstractAssembler {
// Vector sum of absolute difference.
void vpsadbw(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len);
-#ifndef _LP64 // no 32bit push/pop on amd64
- void pushl(Address src);
-#endif
-
void pushq(Address src);
void rcll(Register dst, int imm8);
@@ -2289,7 +2074,6 @@ class Assembler : public AbstractAssembler {
void rorl(Register dst, int imm8);
void erorl(Register dst, Register src, int imm8, bool no_flags);
-#ifdef _LP64
void rolq(Register dst);
void erolq(Register dst, Register src, bool no_flags);
void rolq(Register dst, int imm8);
@@ -2302,9 +2086,6 @@ class Assembler : public AbstractAssembler {
void rorxl(Register dst, Address src, int imm8);
void rorxq(Register dst, Register src, int imm8);
void rorxq(Register dst, Address src, int imm8);
-#endif
-
- void sahf();
void sall(Register dst, int imm8);
void esall(Register dst, Register src, int imm8, bool no_flags);
@@ -2324,7 +2105,6 @@ class Assembler : public AbstractAssembler {
void sarl(Register dst);
void esarl(Register dst, Register src, bool no_flags);
-#ifdef _LP64
void salq(Register dst, int imm8);
void esalq(Register dst, Register src, int imm8, bool no_flags);
void salq(Register dst);
@@ -2342,7 +2122,6 @@ class Assembler : public AbstractAssembler {
void esarq(Register dst, Register src, int imm8, bool no_flags);
void sarq(Register dst);
void esarq(Register dst, Register src, bool no_flags);
-#endif
void sbbl(Address dst, int32_t imm32);
void sbbl(Register dst, int32_t imm32);
@@ -2383,12 +2162,10 @@ class Assembler : public AbstractAssembler {
void eshrdl(Register dst, Register src1, Register src2, bool no_flags);
void shrdl(Register dst, Register src, int8_t imm8);
void eshrdl(Register dst, Register src1, Register src2, int8_t imm8, bool no_flags);
-#ifdef _LP64
void shldq(Register dst, Register src, int8_t imm8);
void eshldq(Register dst, Register src1, Register src2, int8_t imm8, bool no_flags);
void shrdq(Register dst, Register src, int8_t imm8);
void eshrdq(Register dst, Register src1, Register src2, int8_t imm8, bool no_flags);
-#endif
void shll(Register dst, int imm8);
void eshll(Register dst, Register src, int imm8, bool no_flags);
@@ -2888,6 +2665,24 @@ class Assembler : public AbstractAssembler {
void evplzcntd(XMMRegister dst, KRegister mask, XMMRegister src, bool merge, int vector_len);
void evplzcntq(XMMRegister dst, KRegister mask, XMMRegister src, bool merge, int vector_len);
+ // Float16 Vector instructions.
+ void evaddph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len);
+ void evaddph(XMMRegister dst, XMMRegister nds, Address src, int vector_len);
+ void evsubph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len);
+ void evsubph(XMMRegister dst, XMMRegister nds, Address src, int vector_len);
+ void evdivph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len);
+ void evdivph(XMMRegister dst, XMMRegister nds, Address src, int vector_len);
+ void evmulph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len);
+ void evmulph(XMMRegister dst, XMMRegister nds, Address src, int vector_len);
+ void evminph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len);
+ void evminph(XMMRegister dst, XMMRegister nds, Address src, int vector_len);
+ void evmaxph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len);
+ void evmaxph(XMMRegister dst, XMMRegister nds, Address src, int vector_len);
+ void evfmadd132ph(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len);
+ void evfmadd132ph(XMMRegister dst, XMMRegister nds, Address src, int vector_len);
+ void evsqrtph(XMMRegister dst, XMMRegister src, int vector_len);
+ void evsqrtph(XMMRegister dst, Address src, int vector_len);
+
// Sub packed integers
void psubb(XMMRegister dst, XMMRegister src);
void psubw(XMMRegister dst, XMMRegister src);
@@ -3195,6 +2990,9 @@ class Assembler : public AbstractAssembler {
void evcmpps(KRegister kdst, KRegister mask, XMMRegister nds, XMMRegister src,
ComparisonPredicateFP comparison, int vector_len);
+ void evcmpph(KRegister kdst, KRegister mask, XMMRegister nds, XMMRegister src,
+ ComparisonPredicateFP comparison, int vector_len);
+
void evcmpsh(KRegister kdst, KRegister mask, XMMRegister nds, XMMRegister src,
ComparisonPredicateFP comparison);
diff --git a/src/hotspot/cpu/x86/assembler_x86.inline.hpp b/src/hotspot/cpu/x86/assembler_x86.inline.hpp
index f5cc75a55c5d8..6fae97643060a 100644
--- a/src/hotspot/cpu/x86/assembler_x86.inline.hpp
+++ b/src/hotspot/cpu/x86/assembler_x86.inline.hpp
@@ -25,58 +25,4 @@
#ifndef CPU_X86_ASSEMBLER_X86_INLINE_HPP
#define CPU_X86_ASSEMBLER_X86_INLINE_HPP
-#include "asm/assembler.inline.hpp"
-#include "asm/codeBuffer.hpp"
-#include "code/codeCache.hpp"
-
-#ifndef _LP64
-inline int Assembler::prefix_and_encode(int reg_enc, bool byteinst, bool is_map1)
-{
- int opc_prefix = is_map1 ? 0x0F00 : 0;
- return opc_prefix | reg_enc;
-}
-
-inline int Assembler::prefixq_and_encode(int reg_enc, bool is_map1) {
- int opc_prefix = is_map1 ? 0xF00 : 0;
- return opc_prefix | reg_enc;
-}
-
-inline int Assembler::prefix_and_encode(int dst_enc, bool dst_is_byte, int src_enc, bool src_is_byte, bool is_map1) {
- int opc_prefix = is_map1 ? 0xF00 : 0;
- return opc_prefix | (dst_enc << 3 | src_enc);
-}
-
-inline int Assembler::prefixq_and_encode(int dst_enc, int src_enc, bool is_map1) {
- int opc_prefix = is_map1 ? 0xF00 : 0;
- return opc_prefix | dst_enc << 3 | src_enc;
-}
-
-inline void Assembler::prefix(Register reg) {}
-inline void Assembler::prefix(Register dst, Register src, Prefix p) {}
-inline void Assembler::prefix(Register dst, Address adr, Prefix p) {}
-
-inline void Assembler::prefix(Address adr, bool is_map1) {
- if (is_map1) {
- emit_int8(0x0F);
- }
-}
-
-inline void Assembler::prefixq(Address adr) {}
-
-inline void Assembler::prefix(Address adr, Register reg, bool byteinst, bool is_map1) {
- if (is_map1) {
- emit_int8(0x0F);
- }
-}
-inline void Assembler::prefixq(Address adr, Register reg, bool is_map1) {
- if (is_map1) {
- emit_int8(0x0F);
- }
-}
-
-inline void Assembler::prefix(Address adr, XMMRegister reg) {}
-inline void Assembler::prefixq(Address adr, XMMRegister reg) {}
-
-#endif // _LP64
-
#endif // CPU_X86_ASSEMBLER_X86_INLINE_HPP
diff --git a/src/hotspot/cpu/x86/c1_CodeStubs_x86.cpp b/src/hotspot/cpu/x86/c1_CodeStubs_x86.cpp
index 80365878061d0..7c0d3ff624d6a 100644
--- a/src/hotspot/cpu/x86/c1_CodeStubs_x86.cpp
+++ b/src/hotspot/cpu/x86/c1_CodeStubs_x86.cpp
@@ -37,66 +37,12 @@
#define __ ce->masm()->
-#ifndef _LP64
-float ConversionStub::float_zero = 0.0;
-double ConversionStub::double_zero = 0.0;
-
-void ConversionStub::emit_code(LIR_Assembler* ce) {
- __ bind(_entry);
- assert(bytecode() == Bytecodes::_f2i || bytecode() == Bytecodes::_d2i, "other conversions do not require stub");
-
-
- if (input()->is_single_xmm()) {
- __ comiss(input()->as_xmm_float_reg(),
- ExternalAddress((address)&float_zero));
- } else if (input()->is_double_xmm()) {
- __ comisd(input()->as_xmm_double_reg(),
- ExternalAddress((address)&double_zero));
- } else {
- __ push(rax);
- __ ftst();
- __ fnstsw_ax();
- __ sahf();
- __ pop(rax);
- }
-
- Label NaN, do_return;
- __ jccb(Assembler::parity, NaN);
- __ jccb(Assembler::below, do_return);
-
- // input is > 0 -> return maxInt
- // result register already contains 0x80000000, so subtracting 1 gives 0x7fffffff
- __ decrement(result()->as_register());
- __ jmpb(do_return);
-
- // input is NaN -> return 0
- __ bind(NaN);
- __ xorptr(result()->as_register(), result()->as_register());
-
- __ bind(do_return);
- __ jmp(_continuation);
-}
-#endif // !_LP64
-
void C1SafepointPollStub::emit_code(LIR_Assembler* ce) {
__ bind(_entry);
InternalAddress safepoint_pc(ce->masm()->pc() - ce->masm()->offset() + safepoint_offset());
-#ifdef _LP64
__ lea(rscratch1, safepoint_pc);
__ movptr(Address(r15_thread, JavaThread::saved_exception_pc_offset()), rscratch1);
-#else
- const Register tmp1 = rcx;
- const Register tmp2 = rdx;
- __ push(tmp1);
- __ push(tmp2);
-
- __ lea(tmp1, safepoint_pc);
- __ get_thread(tmp2);
- __ movptr(Address(tmp2, JavaThread::saved_exception_pc_offset()), tmp1);
-
- __ pop(tmp2);
- __ pop(tmp1);
-#endif /* _LP64 */
+
assert(SharedRuntime::polling_page_return_handler_blob() != nullptr,
"polling page return stub not created yet");
@@ -122,7 +68,7 @@ void RangeCheckStub::emit_code(LIR_Assembler* ce) {
__ call(RuntimeAddress(a));
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
return;
}
@@ -142,7 +88,7 @@ void RangeCheckStub::emit_code(LIR_Assembler* ce) {
__ call(RuntimeAddress(Runtime1::entry_for(stub_id)));
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
PredicateFailedStub::PredicateFailedStub(CodeEmitInfo* info) {
@@ -155,7 +101,7 @@ void PredicateFailedStub::emit_code(LIR_Assembler* ce) {
__ call(RuntimeAddress(a));
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
void DivByZeroStub::emit_code(LIR_Assembler* ce) {
@@ -165,7 +111,7 @@ void DivByZeroStub::emit_code(LIR_Assembler* ce) {
__ bind(_entry);
__ call(RuntimeAddress(Runtime1::entry_for(C1StubId::throw_div0_exception_id)));
ce->add_call_info_here(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
@@ -453,7 +399,7 @@ void ImplicitNullCheckStub::emit_code(LIR_Assembler* ce) {
__ call(RuntimeAddress(a));
ce->add_call_info_here(_info);
ce->verify_oop_map(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
@@ -467,7 +413,7 @@ void SimpleExceptionStub::emit_code(LIR_Assembler* ce) {
}
__ call(RuntimeAddress(Runtime1::entry_for(_stub)));
ce->add_call_info_here(_info);
- debug_only(__ should_not_reach_here());
+ DEBUG_ONLY(__ should_not_reach_here());
}
diff --git a/src/hotspot/cpu/x86/c1_Defs_x86.hpp b/src/hotspot/cpu/x86/c1_Defs_x86.hpp
index 1637789e79884..bfb885a1b7301 100644
--- a/src/hotspot/cpu/x86/c1_Defs_x86.hpp
+++ b/src/hotspot/cpu/x86/c1_Defs_x86.hpp
@@ -33,15 +33,11 @@ enum {
// registers
enum {
- pd_nof_cpu_regs_frame_map = NOT_LP64(8) LP64_ONLY(16), // number of registers used during code emission
+ pd_nof_cpu_regs_frame_map = 16, // number of registers used during code emission
pd_nof_fpu_regs_frame_map = FloatRegister::number_of_registers, // number of registers used during code emission
pd_nof_xmm_regs_frame_map = XMMRegister::number_of_registers, // number of registers used during code emission
-#ifdef _LP64
#define UNALLOCATED 4 // rsp, rbp, r15, r10
-#else
- #define UNALLOCATED 2 // rsp, rbp
-#endif // LP64
pd_nof_caller_save_cpu_regs_frame_map = pd_nof_cpu_regs_frame_map - UNALLOCATED, // number of registers killed by calls
pd_nof_caller_save_fpu_regs_frame_map = pd_nof_fpu_regs_frame_map, // number of registers killed by calls
@@ -54,9 +50,9 @@ enum {
pd_nof_fpu_regs_linearscan = pd_nof_fpu_regs_frame_map, // number of registers visible to linear scan
pd_nof_xmm_regs_linearscan = pd_nof_xmm_regs_frame_map, // number of registers visible to linear scan
pd_first_cpu_reg = 0,
- pd_last_cpu_reg = NOT_LP64(5) LP64_ONLY(11),
- pd_first_byte_reg = NOT_LP64(2) LP64_ONLY(0),
- pd_last_byte_reg = NOT_LP64(5) LP64_ONLY(11),
+ pd_last_cpu_reg = 11,
+ pd_first_byte_reg = 0,
+ pd_last_byte_reg = 11,
pd_first_fpu_reg = pd_nof_cpu_regs_frame_map,
pd_last_fpu_reg = pd_first_fpu_reg + 7,
pd_first_xmm_reg = pd_nof_cpu_regs_frame_map + pd_nof_fpu_regs_frame_map,
diff --git a/src/hotspot/cpu/x86/c1_FrameMap_x86.cpp b/src/hotspot/cpu/x86/c1_FrameMap_x86.cpp
index e3c7879260266..bdbab432180bd 100644
--- a/src/hotspot/cpu/x86/c1_FrameMap_x86.cpp
+++ b/src/hotspot/cpu/x86/c1_FrameMap_x86.cpp
@@ -32,7 +32,6 @@ const int FrameMap::pd_c_runtime_reserved_arg_size = 0;
LIR_Opr FrameMap::map_to_opr(BasicType type, VMRegPair* reg, bool) {
LIR_Opr opr = LIR_OprFact::illegalOpr;
VMReg r_1 = reg->first();
- VMReg r_2 = reg->second();
if (r_1->is_stack()) {
// Convert stack slot to an SP offset
// The calling convention does not count the SharedRuntime::out_preserve_stack_slots() value
@@ -41,14 +40,8 @@ LIR_Opr FrameMap::map_to_opr(BasicType type, VMRegPair* reg, bool) {
opr = LIR_OprFact::address(new LIR_Address(rsp_opr, st_off, type));
} else if (r_1->is_Register()) {
Register reg = r_1->as_Register();
- if (r_2->is_Register() && (type == T_LONG || type == T_DOUBLE)) {
- Register reg2 = r_2->as_Register();
-#ifdef _LP64
- assert(reg2 == reg, "must be same register");
+ if (type == T_LONG || type == T_DOUBLE) {
opr = as_long_opr(reg);
-#else
- opr = as_long_opr(reg2, reg);
-#endif // _LP64
} else if (is_reference_type(type)) {
opr = as_oop_opr(reg);
} else if (type == T_METADATA) {
@@ -111,8 +104,6 @@ LIR_Opr FrameMap::long1_opr;
LIR_Opr FrameMap::xmm0_float_opr;
LIR_Opr FrameMap::xmm0_double_opr;
-#ifdef _LP64
-
LIR_Opr FrameMap::r8_opr;
LIR_Opr FrameMap::r9_opr;
LIR_Opr FrameMap::r10_opr;
@@ -137,7 +128,6 @@ LIR_Opr FrameMap::r11_metadata_opr;
LIR_Opr FrameMap::r12_metadata_opr;
LIR_Opr FrameMap::r13_metadata_opr;
LIR_Opr FrameMap::r14_metadata_opr;
-#endif // _LP64
LIR_Opr FrameMap::_caller_save_cpu_regs[] = {};
LIR_Opr FrameMap::_caller_save_fpu_regs[] = {};
@@ -157,23 +147,17 @@ XMMRegister FrameMap::nr2xmmreg(int rnr) {
void FrameMap::initialize() {
assert(!_init_done, "once");
- assert(nof_cpu_regs == LP64_ONLY(16) NOT_LP64(8), "wrong number of CPU registers");
+ assert(nof_cpu_regs == 16, "wrong number of CPU registers");
map_register(0, rsi); rsi_opr = LIR_OprFact::single_cpu(0);
map_register(1, rdi); rdi_opr = LIR_OprFact::single_cpu(1);
map_register(2, rbx); rbx_opr = LIR_OprFact::single_cpu(2);
map_register(3, rax); rax_opr = LIR_OprFact::single_cpu(3);
map_register(4, rdx); rdx_opr = LIR_OprFact::single_cpu(4);
map_register(5, rcx); rcx_opr = LIR_OprFact::single_cpu(5);
-
-#ifndef _LP64
- // The unallocatable registers are at the end
- map_register(6, rsp);
- map_register(7, rbp);
-#else
- map_register( 6, r8); r8_opr = LIR_OprFact::single_cpu(6);
- map_register( 7, r9); r9_opr = LIR_OprFact::single_cpu(7);
- map_register( 8, r11); r11_opr = LIR_OprFact::single_cpu(8);
- map_register( 9, r13); r13_opr = LIR_OprFact::single_cpu(9);
+ map_register(6, r8); r8_opr = LIR_OprFact::single_cpu(6);
+ map_register(7, r9); r9_opr = LIR_OprFact::single_cpu(7);
+ map_register(8, r11); r11_opr = LIR_OprFact::single_cpu(8);
+ map_register(9, r13); r13_opr = LIR_OprFact::single_cpu(9);
map_register(10, r14); r14_opr = LIR_OprFact::single_cpu(10);
// r12 is allocated conditionally. With compressed oops it holds
// the heapbase value and is not visible to the allocator.
@@ -183,15 +167,9 @@ void FrameMap::initialize() {
map_register(13, r15); r15_opr = LIR_OprFact::single_cpu(13);
map_register(14, rsp);
map_register(15, rbp);
-#endif // _LP64
-#ifdef _LP64
long0_opr = LIR_OprFact::double_cpu(3 /*eax*/, 3 /*eax*/);
long1_opr = LIR_OprFact::double_cpu(2 /*ebx*/, 2 /*ebx*/);
-#else
- long0_opr = LIR_OprFact::double_cpu(3 /*eax*/, 4 /*edx*/);
- long1_opr = LIR_OprFact::double_cpu(2 /*ebx*/, 5 /*ecx*/);
-#endif // _LP64
xmm0_float_opr = LIR_OprFact::single_xmm(0);
xmm0_double_opr = LIR_OprFact::double_xmm(0);
@@ -201,16 +179,12 @@ void FrameMap::initialize() {
_caller_save_cpu_regs[3] = rax_opr;
_caller_save_cpu_regs[4] = rdx_opr;
_caller_save_cpu_regs[5] = rcx_opr;
-
-#ifdef _LP64
_caller_save_cpu_regs[6] = r8_opr;
_caller_save_cpu_regs[7] = r9_opr;
_caller_save_cpu_regs[8] = r11_opr;
_caller_save_cpu_regs[9] = r13_opr;
_caller_save_cpu_regs[10] = r14_opr;
_caller_save_cpu_regs[11] = r12_opr;
-#endif // _LP64
-
_xmm_regs[0] = xmm0;
_xmm_regs[1] = xmm1;
@@ -220,8 +194,6 @@ void FrameMap::initialize() {
_xmm_regs[5] = xmm5;
_xmm_regs[6] = xmm6;
_xmm_regs[7] = xmm7;
-
-#ifdef _LP64
_xmm_regs[8] = xmm8;
_xmm_regs[9] = xmm9;
_xmm_regs[10] = xmm10;
@@ -246,7 +218,6 @@ void FrameMap::initialize() {
_xmm_regs[29] = xmm29;
_xmm_regs[30] = xmm30;
_xmm_regs[31] = xmm31;
-#endif // _LP64
for (int i = 0; i < 8; i++) {
_caller_save_fpu_regs[i] = LIR_OprFact::single_fpu(i);
@@ -276,7 +247,6 @@ void FrameMap::initialize() {
rsp_opr = as_pointer_opr(rsp);
rbp_opr = as_pointer_opr(rbp);
-#ifdef _LP64
r8_oop_opr = as_oop_opr(r8);
r9_oop_opr = as_oop_opr(r9);
r11_oop_opr = as_oop_opr(r11);
@@ -290,7 +260,6 @@ void FrameMap::initialize() {
r12_metadata_opr = as_metadata_opr(r12);
r13_metadata_opr = as_metadata_opr(r13);
r14_metadata_opr = as_metadata_opr(r14);
-#endif // _LP64
VMRegPair regs;
BasicType sig_bt = T_OBJECT;
diff --git a/src/hotspot/cpu/x86/c1_FrameMap_x86.hpp b/src/hotspot/cpu/x86/c1_FrameMap_x86.hpp
index ce892efbed243..08b872cb0951d 100644
--- a/src/hotspot/cpu/x86/c1_FrameMap_x86.hpp
+++ b/src/hotspot/cpu/x86/c1_FrameMap_x86.hpp
@@ -41,13 +41,8 @@
nof_xmm_regs = pd_nof_xmm_regs_frame_map,
nof_caller_save_xmm_regs = pd_nof_caller_save_xmm_regs_frame_map,
first_available_sp_in_frame = 0,
-#ifndef _LP64
- frame_pad_in_bytes = 8,
- nof_reg_args = 2
-#else
frame_pad_in_bytes = 16,
nof_reg_args = 6
-#endif // _LP64
};
private:
@@ -81,8 +76,6 @@
static LIR_Opr rdx_metadata_opr;
static LIR_Opr rcx_metadata_opr;
-#ifdef _LP64
-
static LIR_Opr r8_opr;
static LIR_Opr r9_opr;
static LIR_Opr r10_opr;
@@ -108,28 +101,17 @@
static LIR_Opr r13_metadata_opr;
static LIR_Opr r14_metadata_opr;
-#endif // _LP64
-
static LIR_Opr long0_opr;
static LIR_Opr long1_opr;
static LIR_Opr xmm0_float_opr;
static LIR_Opr xmm0_double_opr;
-#ifdef _LP64
static LIR_Opr as_long_opr(Register r) {
return LIR_OprFact::double_cpu(cpu_reg2rnr(r), cpu_reg2rnr(r));
}
static LIR_Opr as_pointer_opr(Register r) {
return LIR_OprFact::double_cpu(cpu_reg2rnr(r), cpu_reg2rnr(r));
}
-#else
- static LIR_Opr as_long_opr(Register r, Register r2) {
- return LIR_OprFact::double_cpu(cpu_reg2rnr(r), cpu_reg2rnr(r2));
- }
- static LIR_Opr as_pointer_opr(Register r) {
- return LIR_OprFact::single_cpu(cpu_reg2rnr(r));
- }
-#endif // _LP64
// VMReg name for spilled physical FPU stack slot n
static VMReg fpu_regname (int n);
diff --git a/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp b/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp
index ed16f81cba18a..04e32e2b8be8c 100644
--- a/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp
+++ b/src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp
@@ -169,7 +169,6 @@ void LIR_Assembler::push(LIR_Opr opr) {
if (opr->is_single_cpu()) {
__ push_reg(opr->as_register());
} else if (opr->is_double_cpu()) {
- NOT_LP64(__ push_reg(opr->as_register_hi()));
__ push_reg(opr->as_register_lo());
} else if (opr->is_stack()) {
__ push_addr(frame_map()->address_for_slot(opr->single_stack_ix()));
@@ -325,11 +324,9 @@ void LIR_Assembler::clinit_barrier(ciMethod* method) {
Label L_skip_barrier;
Register klass = rscratch1;
- Register thread = LP64_ONLY( r15_thread ) NOT_LP64( noreg );
- assert(thread != noreg, "x86_32 not implemented");
__ mov_metadata(klass, method->holder()->constant_encoding());
- __ clinit_barrier(klass, thread, &L_skip_barrier /*L_fast_path*/);
+ __ clinit_barrier(klass, &L_skip_barrier /*L_fast_path*/);
__ jump(RuntimeAddress(SharedRuntime::get_handle_wrong_method_stub()));
@@ -401,11 +398,9 @@ int LIR_Assembler::emit_unwind_handler() {
int offset = code_offset();
// Fetch the exception from TLS and clear out exception related thread state
- Register thread = NOT_LP64(rsi) LP64_ONLY(r15_thread);
- NOT_LP64(__ get_thread(thread));
- __ movptr(rax, Address(thread, JavaThread::exception_oop_offset()));
- __ movptr(Address(thread, JavaThread::exception_oop_offset()), NULL_WORD);
- __ movptr(Address(thread, JavaThread::exception_pc_offset()), NULL_WORD);
+ __ movptr(rax, Address(r15_thread, JavaThread::exception_oop_offset()));
+ __ movptr(Address(r15_thread, JavaThread::exception_oop_offset()), NULL_WORD);
+ __ movptr(Address(r15_thread, JavaThread::exception_pc_offset()), NULL_WORD);
__ bind(_unwind_handler_entry);
__ verify_not_null_oop(rax);
@@ -427,14 +422,8 @@ int LIR_Assembler::emit_unwind_handler() {
}
if (compilation()->env()->dtrace_method_probes()) {
-#ifdef _LP64
__ mov(rdi, r15_thread);
__ mov_metadata(rsi, method()->constant_encoding());
-#else
- __ get_thread(rax);
- __ movptr(Address(rsp, 0), rax);
- __ mov_metadata(Address(rsp, sizeof(void*)), method()->constant_encoding(), noreg);
-#endif
__ call(RuntimeAddress(CAST_FROM_FN_PTR(address, SharedRuntime::dtrace_method_exit)));
}
@@ -491,15 +480,9 @@ void LIR_Assembler::return_op(LIR_Opr result, C1SafepointPollStub* code_stub) {
// Note: we do not need to round double result; float result has the right precision
// the poll sets the condition code, but no data registers
-#ifdef _LP64
- const Register thread = r15_thread;
-#else
- const Register thread = rbx;
- __ get_thread(thread);
-#endif
code_stub->set_safepoint_offset(__ offset());
__ relocate(relocInfo::poll_return_type);
- __ safepoint_poll(*code_stub->entry(), thread, true /* at_return */, true /* in_nmethod */);
+ __ safepoint_poll(*code_stub->entry(), true /* at_return */, true /* in_nmethod */);
__ ret(0);
}
@@ -507,21 +490,14 @@ void LIR_Assembler::return_op(LIR_Opr result, C1SafepointPollStub* code_stub) {
int LIR_Assembler::safepoint_poll(LIR_Opr tmp, CodeEmitInfo* info) {
guarantee(info != nullptr, "Shouldn't be null");
int offset = __ offset();
-#ifdef _LP64
const Register poll_addr = rscratch1;
__ movptr(poll_addr, Address(r15_thread, JavaThread::polling_page_offset()));
-#else
- assert(tmp->is_cpu_register(), "needed");
- const Register poll_addr = tmp->as_register();
- __ get_thread(poll_addr);
- __ movptr(poll_addr, Address(poll_addr, in_bytes(JavaThread::polling_page_offset())));
-#endif
add_debug_info_for_branch(info);
__ relocate(relocInfo::poll_type);
address pre_pc = __ pc();
__ testl(rax, Address(poll_addr, 0));
address post_pc = __ pc();
- guarantee(pointer_delta(post_pc, pre_pc, 1) == 2 LP64_ONLY(+1), "must be exact length");
+ guarantee(pointer_delta(post_pc, pre_pc, 1) == 3, "must be exact length");
return offset;
}
@@ -555,12 +531,7 @@ void LIR_Assembler::const2reg(LIR_Opr src, LIR_Opr dest, LIR_PatchCode patch_cod
case T_LONG: {
assert(patch_code == lir_patch_none, "no patching handled here");
-#ifdef _LP64
__ movptr(dest->as_register_lo(), (intptr_t)c->as_jlong());
-#else
- __ movptr(dest->as_register_lo(), c->as_jint_lo());
- __ movptr(dest->as_register_hi(), c->as_jint_hi());
-#endif // _LP64
break;
}
@@ -636,17 +607,10 @@ void LIR_Assembler::const2stack(LIR_Opr src, LIR_Opr dest) {
case T_LONG: // fall through
case T_DOUBLE:
-#ifdef _LP64
__ movptr(frame_map()->address_for_slot(dest->double_stack_ix(),
lo_word_offset_in_bytes),
(intptr_t)c->as_jlong_bits(),
rscratch1);
-#else
- __ movptr(frame_map()->address_for_slot(dest->double_stack_ix(),
- lo_word_offset_in_bytes), c->as_jint_lo_bits());
- __ movptr(frame_map()->address_for_slot(dest->double_stack_ix(),
- hi_word_offset_in_bytes), c->as_jint_hi_bits());
-#endif // _LP64
break;
default:
@@ -677,20 +641,15 @@ void LIR_Assembler::const2mem(LIR_Opr src, LIR_Opr dest, BasicType type, CodeEmi
if (UseCompressedOops && !wide) {
__ movl(as_Address(addr), NULL_WORD);
} else {
-#ifdef _LP64
__ xorptr(rscratch1, rscratch1);
null_check_here = code_offset();
__ movptr(as_Address(addr), rscratch1);
-#else
- __ movptr(as_Address(addr), NULL_WORD);
-#endif
}
} else {
if (is_literal_address(addr)) {
ShouldNotReachHere();
__ movoop(as_Address(addr, noreg), c->as_jobject(), rscratch1);
} else {
-#ifdef _LP64
__ movoop(rscratch1, c->as_jobject());
if (UseCompressedOops && !wide) {
__ encode_heap_oop(rscratch1);
@@ -700,16 +659,12 @@ void LIR_Assembler::const2mem(LIR_Opr src, LIR_Opr dest, BasicType type, CodeEmi
null_check_here = code_offset();
__ movptr(as_Address_lo(addr), rscratch1);
}
-#else
- __ movoop(as_Address(addr), c->as_jobject(), noreg);
-#endif
}
}
break;
case T_LONG: // fall through
case T_DOUBLE:
-#ifdef _LP64
if (is_literal_address(addr)) {
ShouldNotReachHere();
__ movptr(as_Address(addr, r15_thread), (intptr_t)c->as_jlong_bits());
@@ -718,11 +673,6 @@ void LIR_Assembler::const2mem(LIR_Opr src, LIR_Opr dest, BasicType type, CodeEmi
null_check_here = code_offset();
__ movptr(as_Address_lo(addr), r10);
}
-#else
- // Always reachable in 32bit so this doesn't produce useless move literal
- __ movptr(as_Address_hi(addr), c->as_jint_hi_bits());
- __ movptr(as_Address_lo(addr), c->as_jint_lo_bits());
-#endif // _LP64
break;
case T_BOOLEAN: // fall through
@@ -751,13 +701,11 @@ void LIR_Assembler::reg2reg(LIR_Opr src, LIR_Opr dest) {
// move between cpu-registers
if (dest->is_single_cpu()) {
-#ifdef _LP64
if (src->type() == T_LONG) {
// Can do LONG -> OBJECT
move_regs(src->as_register_lo(), dest->as_register());
return;
}
-#endif
assert(src->is_single_cpu(), "must match");
if (src->type() == T_OBJECT) {
__ verify_oop(src->as_register());
@@ -765,39 +713,20 @@ void LIR_Assembler::reg2reg(LIR_Opr src, LIR_Opr dest) {
move_regs(src->as_register(), dest->as_register());
} else if (dest->is_double_cpu()) {
-#ifdef _LP64
if (is_reference_type(src->type())) {
// Surprising to me but we can see move of a long to t_object
__ verify_oop(src->as_register());
move_regs(src->as_register(), dest->as_register_lo());
return;
}
-#endif
assert(src->is_double_cpu(), "must match");
Register f_lo = src->as_register_lo();
Register f_hi = src->as_register_hi();
Register t_lo = dest->as_register_lo();
Register t_hi = dest->as_register_hi();
-#ifdef _LP64
assert(f_hi == f_lo, "must be same");
assert(t_hi == t_lo, "must be same");
move_regs(f_lo, t_lo);
-#else
- assert(f_lo != f_hi && t_lo != t_hi, "invalid register allocation");
-
-
- if (f_lo == t_hi && f_hi == t_lo) {
- swap_reg(f_lo, f_hi);
- } else if (f_hi == t_lo) {
- assert(f_lo != t_hi, "overwriting register");
- move_regs(f_hi, t_hi);
- move_regs(f_lo, t_lo);
- } else {
- assert(f_hi != t_lo, "overwriting register");
- move_regs(f_lo, t_lo);
- move_regs(f_hi, t_hi);
- }
-#endif // LP64
// move between xmm-registers
} else if (dest->is_single_xmm()) {
@@ -831,7 +760,6 @@ void LIR_Assembler::reg2stack(LIR_Opr src, LIR_Opr dest, BasicType type) {
Address dstLO = frame_map()->address_for_slot(dest->double_stack_ix(), lo_word_offset_in_bytes);
Address dstHI = frame_map()->address_for_slot(dest->double_stack_ix(), hi_word_offset_in_bytes);
__ movptr (dstLO, src->as_register_lo());
- NOT_LP64(__ movptr (dstHI, src->as_register_hi()));
} else if (src->is_single_xmm()) {
Address dst_addr = frame_map()->address_for_slot(dest->single_stack_ix());
@@ -854,7 +782,6 @@ void LIR_Assembler::reg2mem(LIR_Opr src, LIR_Opr dest, BasicType type, LIR_Patch
if (is_reference_type(type)) {
__ verify_oop(src->as_register());
-#ifdef _LP64
if (UseCompressedOops && !wide) {
__ movptr(compressed_src, src->as_register());
__ encode_heap_oop(compressed_src);
@@ -862,7 +789,6 @@ void LIR_Assembler::reg2mem(LIR_Opr src, LIR_Opr dest, BasicType type, LIR_Patch
info->oop_map()->set_narrowoop(compressed_src->as_VMReg());
}
}
-#endif
}
if (patch_code != lir_patch_none) {
@@ -893,14 +819,6 @@ void LIR_Assembler::reg2mem(LIR_Opr src, LIR_Opr dest, BasicType type, LIR_Patch
__ movptr(as_Address(to_addr), src->as_register());
}
break;
- case T_METADATA:
- // We get here to store a method pointer to the stack to pass to
- // a dtrace runtime call. This can't work on 64 bit with
- // compressed klass ptrs: T_METADATA can be a compressed klass
- // ptr or a 64 bit method pointer.
- LP64_ONLY(ShouldNotReachHere());
- __ movptr(as_Address(to_addr), src->as_register());
- break;
case T_ADDRESS:
__ movptr(as_Address(to_addr), src->as_register());
break;
@@ -911,35 +829,7 @@ void LIR_Assembler::reg2mem(LIR_Opr src, LIR_Opr dest, BasicType type, LIR_Patch
case T_LONG: {
Register from_lo = src->as_register_lo();
Register from_hi = src->as_register_hi();
-#ifdef _LP64
__ movptr(as_Address_lo(to_addr), from_lo);
-#else
- Register base = to_addr->base()->as_register();
- Register index = noreg;
- if (to_addr->index()->is_register()) {
- index = to_addr->index()->as_register();
- }
- if (base == from_lo || index == from_lo) {
- assert(base != from_hi, "can't be");
- assert(index == noreg || (index != base && index != from_hi), "can't handle this");
- __ movl(as_Address_hi(to_addr), from_hi);
- if (patch != nullptr) {
- patching_epilog(patch, lir_patch_high, base, info);
- patch = new PatchingStub(_masm, PatchingStub::access_field_id);
- patch_code = lir_patch_low;
- }
- __ movl(as_Address_lo(to_addr), from_lo);
- } else {
- assert(index == noreg || (index != base && index != from_lo), "can't handle this");
- __ movl(as_Address_lo(to_addr), from_lo);
- if (patch != nullptr) {
- patching_epilog(patch, lir_patch_low, base, info);
- patch = new PatchingStub(_masm, PatchingStub::access_field_id);
- patch_code = lir_patch_high;
- }
- __ movl(as_Address_hi(to_addr), from_hi);
- }
-#endif // _LP64
break;
}
@@ -988,7 +878,6 @@ void LIR_Assembler::stack2reg(LIR_Opr src, LIR_Opr dest, BasicType type) {
Address src_addr_LO = frame_map()->address_for_slot(src->double_stack_ix(), lo_word_offset_in_bytes);
Address src_addr_HI = frame_map()->address_for_slot(src->double_stack_ix(), hi_word_offset_in_bytes);
__ movptr(dest->as_register_lo(), src_addr_LO);
- NOT_LP64(__ movptr(dest->as_register_hi(), src_addr_HI));
} else if (dest->is_single_xmm()) {
Address src_addr = frame_map()->address_for_slot(src->single_stack_ix());
@@ -1010,27 +899,14 @@ void LIR_Assembler::stack2stack(LIR_Opr src, LIR_Opr dest, BasicType type) {
__ pushptr(frame_map()->address_for_slot(src ->single_stack_ix()));
__ popptr (frame_map()->address_for_slot(dest->single_stack_ix()));
} else {
-#ifndef _LP64
- __ pushl(frame_map()->address_for_slot(src ->single_stack_ix()));
- __ popl (frame_map()->address_for_slot(dest->single_stack_ix()));
-#else
//no pushl on 64bits
__ movl(rscratch1, frame_map()->address_for_slot(src ->single_stack_ix()));
__ movl(frame_map()->address_for_slot(dest->single_stack_ix()), rscratch1);
-#endif
}
} else if (src->is_double_stack()) {
-#ifdef _LP64
__ pushptr(frame_map()->address_for_slot(src ->double_stack_ix()));
__ popptr (frame_map()->address_for_slot(dest->double_stack_ix()));
-#else
- __ pushl(frame_map()->address_for_slot(src ->double_stack_ix(), 0));
- // push and pop the part at src + wordSize, adding wordSize for the previous push
- __ pushl(frame_map()->address_for_slot(src ->double_stack_ix(), 2 * wordSize));
- __ popl (frame_map()->address_for_slot(dest->double_stack_ix(), 2 * wordSize));
- __ popl (frame_map()->address_for_slot(dest->double_stack_ix(), 0));
-#endif // _LP64
} else {
ShouldNotReachHere();
@@ -1113,44 +989,7 @@ void LIR_Assembler::mem2reg(LIR_Opr src, LIR_Opr dest, BasicType type, LIR_Patch
case T_LONG: {
Register to_lo = dest->as_register_lo();
Register to_hi = dest->as_register_hi();
-#ifdef _LP64
__ movptr(to_lo, as_Address_lo(addr));
-#else
- Register base = addr->base()->as_register();
- Register index = noreg;
- if (addr->index()->is_register()) {
- index = addr->index()->as_register();
- }
- if ((base == to_lo && index == to_hi) ||
- (base == to_hi && index == to_lo)) {
- // addresses with 2 registers are only formed as a result of
- // array access so this code will never have to deal with
- // patches or null checks.
- assert(info == nullptr && patch == nullptr, "must be");
- __ lea(to_hi, as_Address(addr));
- __ movl(to_lo, Address(to_hi, 0));
- __ movl(to_hi, Address(to_hi, BytesPerWord));
- } else if (base == to_lo || index == to_lo) {
- assert(base != to_hi, "can't be");
- assert(index == noreg || (index != base && index != to_hi), "can't handle this");
- __ movl(to_hi, as_Address_hi(addr));
- if (patch != nullptr) {
- patching_epilog(patch, lir_patch_high, base, info);
- patch = new PatchingStub(_masm, PatchingStub::access_field_id);
- patch_code = lir_patch_low;
- }
- __ movl(to_lo, as_Address_lo(addr));
- } else {
- assert(index == noreg || (index != base && index != to_lo), "can't handle this");
- __ movl(to_lo, as_Address_lo(addr));
- if (patch != nullptr) {
- patching_epilog(patch, lir_patch_low, base, info);
- patch = new PatchingStub(_masm, PatchingStub::access_field_id);
- patch_code = lir_patch_high;
- }
- __ movl(to_hi, as_Address_hi(addr));
- }
-#endif // _LP64
break;
}
@@ -1200,11 +1039,9 @@ void LIR_Assembler::mem2reg(LIR_Opr src, LIR_Opr dest, BasicType type, LIR_Patch
}
if (is_reference_type(type)) {
-#ifdef _LP64
if (UseCompressedOops && !wide) {
__ decode_heap_oop(dest->as_register());
}
-#endif
__ verify_oop(dest->as_register());
}
@@ -1299,21 +1136,11 @@ void LIR_Assembler::emit_opConvert(LIR_OpConvert* op) {
switch (op->bytecode()) {
case Bytecodes::_i2l:
-#ifdef _LP64
__ movl2ptr(dest->as_register_lo(), src->as_register());
-#else
- move_regs(src->as_register(), dest->as_register_lo());
- move_regs(src->as_register(), dest->as_register_hi());
- __ sarl(dest->as_register_hi(), 31);
-#endif // LP64
break;
case Bytecodes::_l2i:
-#ifdef _LP64
__ movl(dest->as_register(), src->as_register_lo());
-#else
- move_regs(src->as_register_lo(), dest->as_register());
-#endif
break;
case Bytecodes::_i2b:
@@ -1396,7 +1223,7 @@ void LIR_Assembler::emit_alloc_obj(LIR_OpAllocObj* op) {
void LIR_Assembler::emit_alloc_array(LIR_OpAllocArray* op) {
Register len = op->len()->as_register();
- LP64_ONLY( __ movslq(len, len); )
+ __ movslq(len, len);
if (UseSlowPath ||
(!UseFastNewObjectArray && is_reference_type(op->type())) ||
@@ -1464,7 +1291,7 @@ void LIR_Assembler::emit_typecheck_helper(LIR_OpTypeCheck *op, Label* success, L
Register dst = op->result_opr()->as_register();
ciKlass* k = op->klass();
Register Rtmp1 = noreg;
- Register tmp_load_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg);
+ Register tmp_load_klass = rscratch1;
// check if it needs to be profiled
ciMethodData* md = nullptr;
@@ -1526,29 +1353,19 @@ void LIR_Assembler::emit_typecheck_helper(LIR_OpTypeCheck *op, Label* success, L
if (!k->is_loaded()) {
klass2reg_with_patching(k_RInfo, op->info_for_patch());
} else {
-#ifdef _LP64
__ mov_metadata(k_RInfo, k->constant_encoding());
-#endif // _LP64
}
__ verify_oop(obj);
if (op->fast_check()) {
// get object class
// not a safepoint as obj null check happens earlier
-#ifdef _LP64
if (UseCompressedClassPointers) {
__ load_klass(Rtmp1, obj, tmp_load_klass);
__ cmpptr(k_RInfo, Rtmp1);
} else {
__ cmpptr(k_RInfo, Address(obj, oopDesc::klass_offset_in_bytes()));
}
-#else
- if (k->is_loaded()) {
- __ cmpklass(Address(obj, oopDesc::klass_offset_in_bytes()), k->constant_encoding());
- } else {
- __ cmpptr(k_RInfo, Address(obj, oopDesc::klass_offset_in_bytes()));
- }
-#endif
__ jcc(Assembler::notEqual, *failure_target);
// successful cast, fall through to profile or jump
} else {
@@ -1557,11 +1374,7 @@ void LIR_Assembler::emit_typecheck_helper(LIR_OpTypeCheck *op, Label* success, L
__ load_klass(klass_RInfo, obj, tmp_load_klass);
if (k->is_loaded()) {
// See if we get an immediate positive hit
-#ifdef _LP64
__ cmpptr(k_RInfo, Address(klass_RInfo, k->super_check_offset()));
-#else
- __ cmpklass(Address(klass_RInfo, k->super_check_offset()), k->constant_encoding());
-#endif // _LP64
if ((juint)in_bytes(Klass::secondary_super_cache_offset()) != k->super_check_offset()) {
__ jcc(Assembler::notEqual, *failure_target);
// successful cast, fall through to profile or jump
@@ -1569,19 +1382,11 @@ void LIR_Assembler::emit_typecheck_helper(LIR_OpTypeCheck *op, Label* success, L
// See if we get an immediate positive hit
__ jcc(Assembler::equal, *success_target);
// check for self
-#ifdef _LP64
__ cmpptr(klass_RInfo, k_RInfo);
-#else
- __ cmpklass(klass_RInfo, k->constant_encoding());
-#endif // _LP64
__ jcc(Assembler::equal, *success_target);
__ push(klass_RInfo);
-#ifdef _LP64
__ push(k_RInfo);
-#else
- __ pushklass(k->constant_encoding(), noreg);
-#endif // _LP64
__ call(RuntimeAddress(Runtime1::entry_for(C1StubId::slow_subtype_check_id)));
__ pop(klass_RInfo);
__ pop(klass_RInfo);
@@ -1610,7 +1415,7 @@ void LIR_Assembler::emit_typecheck_helper(LIR_OpTypeCheck *op, Label* success, L
void LIR_Assembler::emit_opTypeCheck(LIR_OpTypeCheck* op) {
- Register tmp_load_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg);
+ Register tmp_load_klass = rscratch1;
LIR_Code code = op->code();
if (code == lir_store_check) {
Register value = op->object()->as_register();
@@ -1714,17 +1519,7 @@ void LIR_Assembler::emit_opTypeCheck(LIR_OpTypeCheck* op) {
void LIR_Assembler::emit_compare_and_swap(LIR_OpCompareAndSwap* op) {
- if (LP64_ONLY(false &&) op->code() == lir_cas_long) {
- assert(op->cmp_value()->as_register_lo() == rax, "wrong register");
- assert(op->cmp_value()->as_register_hi() == rdx, "wrong register");
- assert(op->new_value()->as_register_lo() == rbx, "wrong register");
- assert(op->new_value()->as_register_hi() == rcx, "wrong register");
- Register addr = op->addr()->as_register();
- __ lock();
- NOT_LP64(__ cmpxchg8(Address(addr, 0)));
-
- } else if (op->code() == lir_cas_int || op->code() == lir_cas_obj ) {
- NOT_LP64(assert(op->addr()->is_single_cpu(), "must be single");)
+ if (op->code() == lir_cas_int || op->code() == lir_cas_obj) {
Register addr = (op->addr()->is_single_cpu() ? op->addr()->as_register() : op->addr()->as_register_lo());
Register newval = op->new_value()->as_register();
Register cmpval = op->cmp_value()->as_register();
@@ -1734,8 +1529,7 @@ void LIR_Assembler::emit_compare_and_swap(LIR_OpCompareAndSwap* op) {
assert(cmpval != addr, "cmp and addr must be in different registers");
assert(newval != addr, "new value and addr must be in different registers");
- if ( op->code() == lir_cas_obj) {
-#ifdef _LP64
+ if (op->code() == lir_cas_obj) {
if (UseCompressedOops) {
__ encode_heap_oop(cmpval);
__ mov(rscratch1, newval);
@@ -1743,9 +1537,7 @@ void LIR_Assembler::emit_compare_and_swap(LIR_OpCompareAndSwap* op) {
__ lock();
// cmpval (rax) is implicitly used by this instruction
__ cmpxchgl(rscratch1, Address(addr, 0));
- } else
-#endif
- {
+ } else {
__ lock();
__ cmpxchgptr(newval, Address(addr, 0));
}
@@ -1754,7 +1546,6 @@ void LIR_Assembler::emit_compare_and_swap(LIR_OpCompareAndSwap* op) {
__ lock();
__ cmpxchgl(newval, Address(addr, 0));
}
-#ifdef _LP64
} else if (op->code() == lir_cas_long) {
Register addr = (op->addr()->is_single_cpu() ? op->addr()->as_register() : op->addr()->as_register_lo());
Register newval = op->new_value()->as_register_lo();
@@ -1766,7 +1557,6 @@ void LIR_Assembler::emit_compare_and_swap(LIR_OpCompareAndSwap* op) {
assert(newval != addr, "new value and addr must be in different registers");
__ lock();
__ cmpxchgq(newval, Address(addr, 0));
-#endif // _LP64
} else {
Unimplemented();
}
@@ -1809,12 +1599,10 @@ void LIR_Assembler::cmove(LIR_Condition condition, LIR_Opr opr1, LIR_Opr opr2, L
assert(opr2->cpu_regnrLo() != result->cpu_regnrLo() && opr2->cpu_regnrLo() != result->cpu_regnrHi(), "opr2 already overwritten by previous move");
assert(opr2->cpu_regnrHi() != result->cpu_regnrLo() && opr2->cpu_regnrHi() != result->cpu_regnrHi(), "opr2 already overwritten by previous move");
__ cmovptr(ncond, result->as_register_lo(), opr2->as_register_lo());
- NOT_LP64(__ cmovptr(ncond, result->as_register_hi(), opr2->as_register_hi());)
} else if (opr2->is_single_stack()) {
__ cmovl(ncond, result->as_register(), frame_map()->address_for_slot(opr2->single_stack_ix()));
} else if (opr2->is_double_stack()) {
__ cmovptr(ncond, result->as_register_lo(), frame_map()->address_for_slot(opr2->double_stack_ix(), lo_word_offset_in_bytes));
- NOT_LP64(__ cmovptr(ncond, result->as_register_hi(), frame_map()->address_for_slot(opr2->double_stack_ix(), hi_word_offset_in_bytes));)
} else {
ShouldNotReachHere();
}
@@ -1890,28 +1678,16 @@ void LIR_Assembler::arith_op(LIR_Code code, LIR_Opr left, LIR_Opr right, LIR_Opr
// cpu register - cpu register
Register rreg_lo = right->as_register_lo();
Register rreg_hi = right->as_register_hi();
- NOT_LP64(assert_different_registers(lreg_lo, lreg_hi, rreg_lo, rreg_hi));
- LP64_ONLY(assert_different_registers(lreg_lo, rreg_lo));
+ assert_different_registers(lreg_lo, rreg_lo);
switch (code) {
case lir_add:
__ addptr(lreg_lo, rreg_lo);
- NOT_LP64(__ adcl(lreg_hi, rreg_hi));
break;
case lir_sub:
__ subptr(lreg_lo, rreg_lo);
- NOT_LP64(__ sbbl(lreg_hi, rreg_hi));
break;
case lir_mul:
-#ifdef _LP64
__ imulq(lreg_lo, rreg_lo);
-#else
- assert(lreg_lo == rax && lreg_hi == rdx, "must be");
- __ imull(lreg_hi, rreg_lo);
- __ imull(rreg_hi, lreg_lo);
- __ addl (rreg_hi, lreg_hi);
- __ mull (rreg_lo);
- __ addl (lreg_hi, rreg_hi);
-#endif // _LP64
break;
default:
ShouldNotReachHere();
@@ -1919,7 +1695,6 @@ void LIR_Assembler::arith_op(LIR_Code code, LIR_Opr left, LIR_Opr right, LIR_Opr
} else if (right->is_constant()) {
// cpu register - constant
-#ifdef _LP64
jlong c = right->as_constant_ptr()->as_jlong_bits();
__ movptr(r10, (intptr_t) c);
switch (code) {
@@ -1932,22 +1707,6 @@ void LIR_Assembler::arith_op(LIR_Code code, LIR_Opr left, LIR_Opr right, LIR_Opr
default:
ShouldNotReachHere();
}
-#else
- jint c_lo = right->as_constant_ptr()->as_jint_lo();
- jint c_hi = right->as_constant_ptr()->as_jint_hi();
- switch (code) {
- case lir_add:
- __ addptr(lreg_lo, c_lo);
- __ adcl(lreg_hi, c_hi);
- break;
- case lir_sub:
- __ subptr(lreg_lo, c_lo);
- __ sbbl(lreg_hi, c_hi);
- break;
- default:
- ShouldNotReachHere();
- }
-#endif // _LP64
} else {
ShouldNotReachHere();
@@ -2123,7 +1882,6 @@ void LIR_Assembler::logic_op(LIR_Code code, LIR_Opr left, LIR_Opr right, LIR_Opr
Register l_lo = left->as_register_lo();
Register l_hi = left->as_register_hi();
if (right->is_constant()) {
-#ifdef _LP64
__ mov64(rscratch1, right->as_constant_ptr()->as_jlong());
switch (code) {
case lir_logic_and:
@@ -2137,50 +1895,22 @@ void LIR_Assembler::logic_op(LIR_Code code, LIR_Opr left, LIR_Opr right, LIR_Opr
break;
default: ShouldNotReachHere();
}
-#else
- int r_lo = right->as_constant_ptr()->as_jint_lo();
- int r_hi = right->as_constant_ptr()->as_jint_hi();
- switch (code) {
- case lir_logic_and:
- __ andl(l_lo, r_lo);
- __ andl(l_hi, r_hi);
- break;
- case lir_logic_or:
- __ orl(l_lo, r_lo);
- __ orl(l_hi, r_hi);
- break;
- case lir_logic_xor:
- __ xorl(l_lo, r_lo);
- __ xorl(l_hi, r_hi);
- break;
- default: ShouldNotReachHere();
- }
-#endif // _LP64
} else {
-#ifdef _LP64
Register r_lo;
if (is_reference_type(right->type())) {
r_lo = right->as_register();
} else {
r_lo = right->as_register_lo();
}
-#else
- Register r_lo = right->as_register_lo();
- Register r_hi = right->as_register_hi();
- assert(l_lo != r_hi, "overwriting registers");
-#endif
switch (code) {
case lir_logic_and:
__ andptr(l_lo, r_lo);
- NOT_LP64(__ andptr(l_hi, r_hi);)
break;
case lir_logic_or:
__ orptr(l_lo, r_lo);
- NOT_LP64(__ orptr(l_hi, r_hi);)
break;
case lir_logic_xor:
__ xorptr(l_lo, r_lo);
- NOT_LP64(__ xorptr(l_hi, r_hi);)
break;
default: ShouldNotReachHere();
}
@@ -2189,19 +1919,7 @@ void LIR_Assembler::logic_op(LIR_Code code, LIR_Opr left, LIR_Opr right, LIR_Opr
Register dst_lo = dst->as_register_lo();
Register dst_hi = dst->as_register_hi();
-#ifdef _LP64
move_regs(l_lo, dst_lo);
-#else
- if (dst_lo == l_hi) {
- assert(dst_hi != l_lo, "overwriting registers");
- move_regs(l_hi, dst_hi);
- move_regs(l_lo, dst_lo);
- } else {
- assert(dst_lo != l_hi, "overwriting registers");
- move_regs(l_lo, dst_lo);
- move_regs(l_hi, dst_hi);
- }
-#endif // _LP64
}
}
@@ -2329,27 +2047,11 @@ void LIR_Assembler::comp_op(LIR_Condition condition, LIR_Opr opr1, LIR_Opr opr2,
Register xlo = opr1->as_register_lo();
Register xhi = opr1->as_register_hi();
if (opr2->is_double_cpu()) {
-#ifdef _LP64
__ cmpptr(xlo, opr2->as_register_lo());
-#else
- // cpu register - cpu register
- Register ylo = opr2->as_register_lo();
- Register yhi = opr2->as_register_hi();
- __ subl(xlo, ylo);
- __ sbbl(xhi, yhi);
- if (condition == lir_cond_equal || condition == lir_cond_notEqual) {
- __ orl(xhi, xlo);
- }
-#endif // _LP64
} else if (opr2->is_constant()) {
// cpu register - constant 0
assert(opr2->as_jlong() == (jlong)0, "only handles zero");
-#ifdef _LP64
__ cmpptr(xlo, (int32_t)opr2->as_jlong());
-#else
- assert(condition == lir_cond_equal || condition == lir_cond_notEqual, "only handles equals case");
- __ orl(xhi, xlo);
-#endif // _LP64
} else {
ShouldNotReachHere();
}
@@ -2398,12 +2100,10 @@ void LIR_Assembler::comp_op(LIR_Condition condition, LIR_Opr opr1, LIR_Opr opr2,
} else if (opr1->is_address() && opr2->is_constant()) {
LIR_Const* c = opr2->as_constant_ptr();
-#ifdef _LP64
if (is_reference_type(c->type())) {
assert(condition == lir_cond_equal || condition == lir_cond_notEqual, "need to reverse");
__ movoop(rscratch1, c->as_jobject());
}
-#endif // LP64
if (op->info() != nullptr) {
add_debug_info_for_null_check_here(op->info());
}
@@ -2412,13 +2112,9 @@ void LIR_Assembler::comp_op(LIR_Condition condition, LIR_Opr opr1, LIR_Opr opr2,
if (c->type() == T_INT) {
__ cmpl(as_Address(addr), c->as_jint());
} else if (is_reference_type(c->type())) {
-#ifdef _LP64
// %%% Make this explode if addr isn't reachable until we figure out a
// better strategy by giving noreg as the temp for as_Address
__ cmpoop(rscratch1, as_Address(addr, noreg));
-#else
- __ cmpoop(as_Address(addr), c->as_jobject());
-#endif // _LP64
} else {
ShouldNotReachHere();
}
@@ -2442,7 +2138,6 @@ void LIR_Assembler::comp_fl2i(LIR_Code code, LIR_Opr left, LIR_Opr right, LIR_Op
}
} else {
assert(code == lir_cmp_l2i, "check");
-#ifdef _LP64
Label done;
Register dest = dst->as_register();
__ cmpptr(left->as_register_lo(), right->as_register_lo());
@@ -2451,13 +2146,6 @@ void LIR_Assembler::comp_fl2i(LIR_Code code, LIR_Opr left, LIR_Opr right, LIR_Op
__ setb(Assembler::notZero, dest);
__ movzbl(dest, dest);
__ bind(done);
-#else
- __ lcmp2int(left->as_register_hi(),
- left->as_register_lo(),
- right->as_register_hi(),
- right->as_register_lo());
- move_regs(left->as_register_hi(), dst->as_register());
-#endif // _LP64
}
}
@@ -2583,22 +2271,12 @@ void LIR_Assembler::shift_op(LIR_Code code, LIR_Opr left, LIR_Opr count, LIR_Opr
Register lo = left->as_register_lo();
Register hi = left->as_register_hi();
assert(lo != SHIFT_count && hi != SHIFT_count, "left cannot be ECX");
-#ifdef _LP64
switch (code) {
case lir_shl: __ shlptr(lo); break;
case lir_shr: __ sarptr(lo); break;
case lir_ushr: __ shrptr(lo); break;
default: ShouldNotReachHere();
}
-#else
-
- switch (code) {
- case lir_shl: __ lshl(hi, lo); break;
- case lir_shr: __ lshr(hi, lo, true); break;
- case lir_ushr: __ lshr(hi, lo, false); break;
- default: ShouldNotReachHere();
- }
-#endif // LP64
} else {
ShouldNotReachHere();
}
@@ -2619,9 +2297,6 @@ void LIR_Assembler::shift_op(LIR_Code code, LIR_Opr left, jint count, LIR_Opr de
default: ShouldNotReachHere();
}
} else if (dest->is_double_cpu()) {
-#ifndef _LP64
- Unimplemented();
-#else
// first move left into dest so that left is not destroyed by the shift
Register value = dest->as_register_lo();
count = count & 0x1F; // Java spec
@@ -2633,7 +2308,6 @@ void LIR_Assembler::shift_op(LIR_Code code, LIR_Opr left, jint count, LIR_Opr de
case lir_ushr: __ shrptr(value, count); break;
default: ShouldNotReachHere();
}
-#endif // _LP64
} else {
ShouldNotReachHere();
}
@@ -2683,7 +2357,7 @@ void LIR_Assembler::emit_arraycopy(LIR_OpArrayCopy* op) {
Register dst_pos = op->dst_pos()->as_register();
Register length = op->length()->as_register();
Register tmp = op->tmp()->as_register();
- Register tmp_load_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg);
+ Register tmp_load_klass = rscratch1;
Register tmp2 = UseCompactObjectHeaders ? rscratch2 : noreg;
CodeStub* stub = op->stub();
@@ -2708,13 +2382,11 @@ void LIR_Assembler::emit_arraycopy(LIR_OpArrayCopy* op) {
// these are just temporary placements until we need to reload
store_parameter(src_pos, 3);
store_parameter(src, 4);
- NOT_LP64(assert(src == rcx && src_pos == rdx, "mismatch in calling convention");)
address copyfunc_addr = StubRoutines::generic_arraycopy();
assert(copyfunc_addr != nullptr, "generic arraycopy stub required");
// pass arguments: may push as this is not a safepoint; SP must be fix at each safepoint
-#ifdef _LP64
// The arguments are in java calling convention so we can trivially shift them to C
// convention
assert_different_registers(c_rarg0, j_rarg1, j_rarg2, j_rarg3, j_rarg4);
@@ -2745,21 +2417,6 @@ void LIR_Assembler::emit_arraycopy(LIR_OpArrayCopy* op) {
#endif
__ call(RuntimeAddress(copyfunc_addr));
#endif // _WIN64
-#else
- __ push(length);
- __ push(dst_pos);
- __ push(dst);
- __ push(src_pos);
- __ push(src);
-
-#ifndef PRODUCT
- if (PrintC1Statistics) {
- __ incrementl(ExternalAddress((address)&Runtime1::_generic_arraycopystub_cnt), rscratch1);
- }
-#endif
- __ call_VM_leaf(copyfunc_addr, 5); // removes pushed parameter from the stack
-
-#endif // _LP64
__ testl(rax, rax);
__ jcc(Assembler::equal, *stub->continuation());
@@ -2865,10 +2522,8 @@ void LIR_Assembler::emit_arraycopy(LIR_OpArrayCopy* op) {
__ jcc(Assembler::less, *stub->entry());
}
-#ifdef _LP64
__ movl2ptr(src_pos, src_pos); //higher 32bits must be null
__ movl2ptr(dst_pos, dst_pos); //higher 32bits must be null
-#endif
if (flags & LIR_OpArrayCopy::type_check) {
// We don't know the array types are compatible
@@ -2932,21 +2587,6 @@ void LIR_Assembler::emit_arraycopy(LIR_OpArrayCopy* op) {
store_parameter(src_pos, 3);
store_parameter(src, 4);
-#ifndef _LP64
- Address dst_klass_addr = Address(dst, oopDesc::klass_offset_in_bytes());
- __ movptr(tmp, dst_klass_addr);
- __ movptr(tmp, Address(tmp, ObjArrayKlass::element_klass_offset()));
- __ push(tmp);
- __ movl(tmp, Address(tmp, Klass::super_check_offset_offset()));
- __ push(tmp);
- __ push(length);
- __ lea(tmp, Address(dst, dst_pos, scale, arrayOopDesc::base_offset_in_bytes(basic_type)));
- __ push(tmp);
- __ lea(tmp, Address(src, src_pos, scale, arrayOopDesc::base_offset_in_bytes(basic_type)));
- __ push(tmp);
-
- __ call_VM_leaf(copyfunc_addr, 5);
-#else
__ movl2ptr(length, length); //higher 32bits must be null
__ lea(c_rarg0, Address(src, src_pos, scale, arrayOopDesc::base_offset_in_bytes(basic_type)));
@@ -2973,8 +2613,6 @@ void LIR_Assembler::emit_arraycopy(LIR_OpArrayCopy* op) {
__ call(RuntimeAddress(copyfunc_addr));
#endif
-#endif
-
#ifndef PRODUCT
if (PrintC1Statistics) {
Label failed;
@@ -3030,11 +2668,9 @@ void LIR_Assembler::emit_arraycopy(LIR_OpArrayCopy* op) {
// but not necessarily exactly of type default_type.
Label known_ok, halt;
__ mov_metadata(tmp, default_type->constant_encoding());
-#ifdef _LP64
if (UseCompressedClassPointers) {
__ encode_klass_not_null(tmp, rscratch1);
}
-#endif
if (basic_type != T_OBJECT) {
__ cmp_klass(tmp, dst, tmp2);
@@ -3059,21 +2695,12 @@ void LIR_Assembler::emit_arraycopy(LIR_OpArrayCopy* op) {
}
#endif
-#ifdef _LP64
assert_different_registers(c_rarg0, dst, dst_pos, length);
__ lea(c_rarg0, Address(src, src_pos, scale, arrayOopDesc::base_offset_in_bytes(basic_type)));
assert_different_registers(c_rarg1, length);
__ lea(c_rarg1, Address(dst, dst_pos, scale, arrayOopDesc::base_offset_in_bytes(basic_type)));
__ mov(c_rarg2, length);
-#else
- __ lea(tmp, Address(src, src_pos, scale, arrayOopDesc::base_offset_in_bytes(basic_type)));
- store_parameter(tmp, 0);
- __ lea(tmp, Address(dst, dst_pos, scale, arrayOopDesc::base_offset_in_bytes(basic_type)));
- store_parameter(tmp, 1);
- store_parameter(length, 2);
-#endif // _LP64
-
bool disjoint = (flags & LIR_OpArrayCopy::overlapping) == 0;
bool aligned = (flags & LIR_OpArrayCopy::unaligned) == 0;
const char *name;
@@ -3146,7 +2773,7 @@ void LIR_Assembler::emit_profile_call(LIR_OpProfileCall* op) {
ciMethod* method = op->profiled_method();
int bci = op->profiled_bci();
ciMethod* callee = op->profiled_callee();
- Register tmp_load_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg);
+ Register tmp_load_klass = rscratch1;
// Update counter for all call types
ciMethodData* md = method->method_data_or_null();
@@ -3217,7 +2844,7 @@ void LIR_Assembler::emit_profile_call(LIR_OpProfileCall* op) {
void LIR_Assembler::emit_profile_type(LIR_OpProfileType* op) {
Register obj = op->obj()->as_register();
Register tmp = op->tmp()->as_pointer_register();
- Register tmp_load_klass = LP64_ONLY(rscratch1) NOT_LP64(noreg);
+ Register tmp_load_klass = rscratch1;
Address mdo_addr = as_Address(op->mdp()->as_address_ptr());
ciKlass* exact_klass = op->exact_klass();
intptr_t current_klass = op->current_klass();
@@ -3237,17 +2864,9 @@ void LIR_Assembler::emit_profile_type(LIR_OpProfileType* op) {
#ifdef ASSERT
if (obj == tmp) {
-#ifdef _LP64
assert_different_registers(obj, rscratch1, mdo_addr.base(), mdo_addr.index());
-#else
- assert_different_registers(obj, mdo_addr.base(), mdo_addr.index());
-#endif
} else {
-#ifdef _LP64
assert_different_registers(obj, tmp, rscratch1, mdo_addr.base(), mdo_addr.index());
-#else
- assert_different_registers(obj, tmp, mdo_addr.base(), mdo_addr.index());
-#endif
}
#endif
if (do_null) {
@@ -3301,9 +2920,7 @@ void LIR_Assembler::emit_profile_type(LIR_OpProfileType* op) {
} else {
__ load_klass(tmp, obj, tmp_load_klass);
}
-#ifdef _LP64
__ mov(rscratch1, tmp); // save original value before XOR
-#endif
__ xorptr(tmp, mdo_addr);
__ testptr(tmp, TypeEntries::type_klass_mask);
// klass seen before, nothing to do. The unknown bit may have been
@@ -3316,7 +2933,6 @@ void LIR_Assembler::emit_profile_type(LIR_OpProfileType* op) {
if (TypeEntries::is_type_none(current_klass)) {
__ testptr(mdo_addr, TypeEntries::type_mask);
__ jccb(Assembler::zero, none);
-#ifdef _LP64
// There is a chance that the checks above (re-reading profiling
// data from memory) fail if another thread has just set the
// profiling to this obj's klass
@@ -3324,7 +2940,6 @@ void LIR_Assembler::emit_profile_type(LIR_OpProfileType* op) {
__ xorptr(tmp, mdo_addr);
__ testptr(tmp, TypeEntries::type_klass_mask);
__ jccb(Assembler::zero, next);
-#endif
}
} else {
assert(ciTypeEntries::valid_ciklass(current_klass) != nullptr &&
@@ -3418,22 +3033,9 @@ void LIR_Assembler::negate(LIR_Opr left, LIR_Opr dest, LIR_Opr tmp) {
} else if (left->is_double_cpu()) {
Register lo = left->as_register_lo();
-#ifdef _LP64
Register dst = dest->as_register_lo();
__ movptr(dst, lo);
__ negptr(dst);
-#else
- Register hi = left->as_register_hi();
- __ lneg(hi, lo);
- if (dest->as_register_lo() == hi) {
- assert(dest->as_register_hi() != lo, "destroying register");
- move_regs(hi, dest->as_register_hi());
- move_regs(lo, dest->as_register_lo());
- } else {
- move_regs(lo, dest->as_register_lo());
- move_regs(hi, dest->as_register_hi());
- }
-#endif // _LP64
} else if (dest->is_single_xmm()) {
assert(!tmp->is_valid(), "do not need temporary");
@@ -3496,13 +3098,7 @@ void LIR_Assembler::volatile_move_op(LIR_Opr src, LIR_Opr dest, BasicType type,
if (src->is_double_xmm()) {
if (dest->is_double_cpu()) {
-#ifdef _LP64
__ movdq(dest->as_register_lo(), src->as_xmm_double_reg());
-#else
- __ movdl(dest->as_register_lo(), src->as_xmm_double_reg());
- __ psrlq(src->as_xmm_double_reg(), 32);
- __ movdl(dest->as_register_hi(), src->as_xmm_double_reg());
-#endif // _LP64
} else if (dest->is_double_stack()) {
__ movdbl(frame_map()->address_for_slot(dest->double_stack_ix()), src->as_xmm_double_reg());
} else if (dest->is_address()) {
@@ -3519,6 +3115,7 @@ void LIR_Assembler::volatile_move_op(LIR_Opr src, LIR_Opr dest, BasicType type,
} else {
ShouldNotReachHere();
}
+
} else {
ShouldNotReachHere();
}
@@ -3601,12 +3198,7 @@ void LIR_Assembler::on_spin_wait() {
void LIR_Assembler::get_thread(LIR_Opr result_reg) {
assert(result_reg->is_register(), "check");
-#ifdef _LP64
- // __ get_thread(result_reg->as_register_lo());
__ mov(result_reg->as_register(), r15_thread);
-#else
- __ get_thread(result_reg->as_register());
-#endif // _LP64
}
@@ -3627,7 +3219,6 @@ void LIR_Assembler::atomic_op(LIR_Code code, LIR_Opr src, LIR_Opr data, LIR_Opr
} else if (data->is_oop()) {
assert (code == lir_xchg, "xadd for oops");
Register obj = data->as_register();
-#ifdef _LP64
if (UseCompressedOops) {
__ encode_heap_oop(obj);
__ xchgl(obj, as_Address(src->as_address_ptr()));
@@ -3635,11 +3226,7 @@ void LIR_Assembler::atomic_op(LIR_Code code, LIR_Opr src, LIR_Opr data, LIR_Opr
} else {
__ xchgptr(obj, as_Address(src->as_address_ptr()));
}
-#else
- __ xchgl(obj, as_Address(src->as_address_ptr()));
-#endif
} else if (data->type() == T_LONG) {
-#ifdef _LP64
assert(data->as_register_lo() == data->as_register_hi(), "should be a single register");
if (code == lir_xadd) {
__ lock();
@@ -3647,9 +3234,6 @@ void LIR_Assembler::atomic_op(LIR_Code code, LIR_Opr src, LIR_Opr data, LIR_Opr
} else {
__ xchgq(data->as_register_lo(), as_Address(src->as_address_ptr()));
}
-#else
- ShouldNotReachHere();
-#endif
} else {
ShouldNotReachHere();
}
diff --git a/src/hotspot/cpu/x86/c1_LIRAssembler_x86.hpp b/src/hotspot/cpu/x86/c1_LIRAssembler_x86.hpp
index c8f97cece6d8b..8524dc90276f0 100644
--- a/src/hotspot/cpu/x86/c1_LIRAssembler_x86.hpp
+++ b/src/hotspot/cpu/x86/c1_LIRAssembler_x86.hpp
@@ -46,9 +46,9 @@
Register recv, Label* update_done);
enum {
- _call_stub_size = NOT_LP64(15) LP64_ONLY(28),
+ _call_stub_size = 28,
_exception_handler_size = DEBUG_ONLY(1*K) NOT_DEBUG(175),
- _deopt_handler_size = NOT_LP64(10) LP64_ONLY(17)
+ _deopt_handler_size = 17
};
public:
diff --git a/src/hotspot/cpu/x86/c1_LIRGenerator_x86.cpp b/src/hotspot/cpu/x86/c1_LIRGenerator_x86.cpp
index fe6d6a58b00dd..60ce3419dfb42 100644
--- a/src/hotspot/cpu/x86/c1_LIRGenerator_x86.cpp
+++ b/src/hotspot/cpu/x86/c1_LIRGenerator_x86.cpp
@@ -142,7 +142,6 @@ bool LIRGenerator::can_inline_as_constant(LIR_Const* c) const {
LIR_Opr LIRGenerator::safepoint_poll_register() {
- NOT_LP64( return new_register(T_ADDRESS); )
return LIR_OprFact::illegalOpr;
}
@@ -152,7 +151,6 @@ LIR_Address* LIRGenerator::generate_address(LIR_Opr base, LIR_Opr index,
assert(base->is_register(), "must be");
if (index->is_constant()) {
LIR_Const *constant = index->as_constant_ptr();
-#ifdef _LP64
jlong c;
if (constant->type() == T_INT) {
c = (jlong(index->as_jint()) << shift) + disp;
@@ -167,11 +165,6 @@ LIR_Address* LIRGenerator::generate_address(LIR_Opr base, LIR_Opr index,
__ move(index, tmp);
return new LIR_Address(base, tmp, type);
}
-#else
- return new LIR_Address(base,
- ((intx)(constant->as_jint()) << shift) + disp,
- type);
-#endif
} else {
return new LIR_Address(base, index, (LIR_Address::Scale)shift, disp, type);
}
@@ -185,7 +178,6 @@ LIR_Address* LIRGenerator::emit_array_address(LIR_Opr array_opr, LIR_Opr index_o
LIR_Address* addr;
if (index_opr->is_constant()) {
int elem_size = type2aelembytes(type);
-#ifdef _LP64
jint index = index_opr->as_jint();
jlong disp = offset_in_bytes + (jlong)(index) * elem_size;
if (disp > max_jint) {
@@ -197,28 +189,12 @@ LIR_Address* LIRGenerator::emit_array_address(LIR_Opr array_opr, LIR_Opr index_o
} else {
addr = new LIR_Address(array_opr, (intx)disp, type);
}
-#else
- // A displacement overflow can also occur for x86 but that is not a problem due to the 32-bit address range!
- // Let's assume an array 'a' and an access with displacement 'disp'. When disp overflows, then "a + disp" will
- // always be negative (i.e. underflows the 32-bit address range):
- // Let N = 2^32: a + signed_overflow(disp) = a + disp - N.
- // "a + disp" is always smaller than N. If an index was chosen which would point to an address beyond N, then
- // range checks would catch that and throw an exception. Thus, a + disp < 0 holds which means that it always
- // underflows the 32-bit address range:
- // unsigned_underflow(a + signed_overflow(disp)) = unsigned_underflow(a + disp - N)
- // = (a + disp - N) + N = a + disp
- // This shows that we still end up at the correct address with a displacement overflow due to the 32-bit address
- // range limitation. This overflow only needs to be handled if addresses can be larger as on 64-bit platforms.
- addr = new LIR_Address(array_opr, offset_in_bytes + (intx)(index_opr->as_jint()) * elem_size, type);
-#endif // _LP64
} else {
-#ifdef _LP64
if (index_opr->type() == T_INT) {
LIR_Opr tmp = new_register(T_LONG);
__ convert(Bytecodes::_i2l, index_opr, tmp);
index_opr = tmp;
}
-#endif // _LP64
addr = new LIR_Address(array_opr,
index_opr,
LIR_Address::scale(type),
@@ -358,34 +334,12 @@ void LIRGenerator::do_ArithmeticOp_FPU(ArithmeticOp* x) {
left.dont_load_item();
}
-#ifndef _LP64
- // do not load right operand if it is a constant. only 0 and 1 are
- // loaded because there are special instructions for loading them
- // without memory access (not needed for SSE2 instructions)
- bool must_load_right = false;
- if (right.is_constant()) {
- LIR_Const* c = right.result()->as_constant_ptr();
- assert(c != nullptr, "invalid constant");
- assert(c->type() == T_FLOAT || c->type() == T_DOUBLE, "invalid type");
-
- if (c->type() == T_FLOAT) {
- must_load_right = UseSSE < 1 && (c->is_one_float() || c->is_zero_float());
- } else {
- must_load_right = UseSSE < 2 && (c->is_one_double() || c->is_zero_double());
- }
- }
-#endif // !LP64
-
if (must_load_both) {
// frem and drem destroy also right operand, so move it to a new register
right.set_destroys_register();
right.load_item();
} else if (right.is_register()) {
right.load_item();
-#ifndef _LP64
- } else if (must_load_right) {
- right.load_item();
-#endif // !LP64
} else {
right.dont_load_item();
}
@@ -395,7 +349,6 @@ void LIRGenerator::do_ArithmeticOp_FPU(ArithmeticOp* x) {
tmp = new_register(T_DOUBLE);
}
-#ifdef _LP64
if (x->op() == Bytecodes::_frem || x->op() == Bytecodes::_drem) {
// frem and drem are implemented as a direct call into the runtime.
LIRItem left(x->x(), this);
@@ -430,27 +383,6 @@ void LIRGenerator::do_ArithmeticOp_FPU(ArithmeticOp* x) {
arithmetic_op_fpu(x->op(), reg, left.result(), right.result(), tmp);
set_result(x, reg);
}
-#else
- if ((UseSSE >= 1 && x->op() == Bytecodes::_frem) || (UseSSE >= 2 && x->op() == Bytecodes::_drem)) {
- // special handling for frem and drem: no SSE instruction, so must use FPU with temporary fpu stack slots
- LIR_Opr fpu0, fpu1;
- if (x->op() == Bytecodes::_frem) {
- fpu0 = LIR_OprFact::single_fpu(0);
- fpu1 = LIR_OprFact::single_fpu(1);
- } else {
- fpu0 = LIR_OprFact::double_fpu(0);
- fpu1 = LIR_OprFact::double_fpu(1);
- }
- __ move(right.result(), fpu1); // order of left and right operand is important!
- __ move(left.result(), fpu0);
- __ rem (fpu0, fpu1, fpu0);
- __ move(fpu0, reg);
-
- } else {
- arithmetic_op_fpu(x->op(), reg, left.result(), right.result(), tmp);
- }
- set_result(x, reg);
-#endif // _LP64
}
@@ -740,7 +672,7 @@ LIR_Opr LIRGenerator::atomic_xchg(BasicType type, LIR_Opr addr, LIRItem& value)
value.load_item();
// Because we want a 2-arg form of xchg and xadd
__ move(value.result(), result);
- assert(type == T_INT || is_oop LP64_ONLY( || type == T_LONG ), "unexpected type");
+ assert(type == T_INT || is_oop || type == T_LONG, "unexpected type");
__ xchg(addr, result, result, LIR_OprFact::illegalOpr);
return result;
}
@@ -750,7 +682,7 @@ LIR_Opr LIRGenerator::atomic_add(BasicType type, LIR_Opr addr, LIRItem& value) {
value.load_item();
// Because we want a 2-arg form of xchg and xadd
__ move(value.result(), result);
- assert(type == T_INT LP64_ONLY( || type == T_LONG ), "unexpected type");
+ assert(type == T_INT || type == T_LONG, "unexpected type");
__ xadd(addr, result, result, LIR_OprFact::illegalOpr);
return result;
}
@@ -788,10 +720,7 @@ void LIRGenerator::do_MathIntrinsic(Intrinsic* x) {
if (x->id() == vmIntrinsics::_dexp || x->id() == vmIntrinsics::_dlog ||
x->id() == vmIntrinsics::_dpow || x->id() == vmIntrinsics::_dcos ||
x->id() == vmIntrinsics::_dsin || x->id() == vmIntrinsics::_dtan ||
- x->id() == vmIntrinsics::_dlog10
-#ifdef _LP64
- || x->id() == vmIntrinsics::_dtanh
-#endif
+ x->id() == vmIntrinsics::_dlog10 || x->id() == vmIntrinsics::_dtanh
) {
do_LibmIntrinsic(x);
return;
@@ -799,12 +728,6 @@ void LIRGenerator::do_MathIntrinsic(Intrinsic* x) {
LIRItem value(x->argument_at(0), this);
- bool use_fpu = false;
-#ifndef _LP64
- if (UseSSE < 2) {
- value.set_destroys_register();
- }
-#endif // !LP64
value.load_item();
LIR_Opr calc_input = value.result();
@@ -832,10 +755,6 @@ void LIRGenerator::do_MathIntrinsic(Intrinsic* x) {
default:
ShouldNotReachHere();
}
-
- if (use_fpu) {
- __ move(calc_result, x->operand());
- }
}
void LIRGenerator::do_LibmIntrinsic(Intrinsic* x) {
@@ -956,20 +875,6 @@ void LIRGenerator::do_ArrayCopy(Intrinsic* x) {
flags = 0;
}
-#ifndef _LP64
- src.load_item_force (FrameMap::rcx_oop_opr);
- src_pos.load_item_force (FrameMap::rdx_opr);
- dst.load_item_force (FrameMap::rax_oop_opr);
- dst_pos.load_item_force (FrameMap::rbx_opr);
- length.load_item_force (FrameMap::rdi_opr);
- LIR_Opr tmp = (FrameMap::rsi_opr);
-
- if (expected_type != nullptr && flags == 0) {
- FrameMap* f = Compilation::current()->frame_map();
- f->update_reserved_argument_area_size(3 * BytesPerWord);
- }
-#else
-
// The java calling convention will give us enough registers
// so that on the stub side the args will be perfect already.
// On the other slow/special case side we call C and the arg
@@ -985,7 +890,6 @@ void LIRGenerator::do_ArrayCopy(Intrinsic* x) {
length.load_item_force (FrameMap::as_opr(j_rarg4));
LIR_Opr tmp = FrameMap::as_opr(j_rarg5);
-#endif // LP64
set_no_result(x);
@@ -1027,18 +931,11 @@ void LIRGenerator::do_update_CRC32(Intrinsic* x) {
}
LIR_Opr base_op = buf.result();
-#ifndef _LP64
- if (!is_updateBytes) { // long b raw address
- base_op = new_register(T_INT);
- __ convert(Bytecodes::_l2i, buf.result(), base_op);
- }
-#else
if (index->is_valid()) {
LIR_Opr tmp = new_register(T_LONG);
__ convert(Bytecodes::_i2l, index, tmp);
index = tmp;
}
-#endif
LIR_Address* a = new LIR_Address(base_op,
index,
@@ -1172,14 +1069,6 @@ void LIRGenerator::do_vectorizedMismatch(Intrinsic* x) {
}
LIR_Opr result_b = b.result();
-#ifndef _LP64
- result_a = new_register(T_INT);
- __ convert(Bytecodes::_l2i, a.result(), result_a);
- result_b = new_register(T_INT);
- __ convert(Bytecodes::_l2i, b.result(), result_b);
-#endif
-
-
LIR_Address* addr_a = new LIR_Address(result_a,
result_aOffset,
constant_aOffset,
@@ -1214,7 +1103,6 @@ void LIRGenerator::do_vectorizedMismatch(Intrinsic* x) {
}
void LIRGenerator::do_Convert(Convert* x) {
-#ifdef _LP64
LIRItem value(x->value(), this);
value.load_item();
LIR_Opr input = value.result();
@@ -1222,66 +1110,6 @@ void LIRGenerator::do_Convert(Convert* x) {
__ convert(x->op(), input, result);
assert(result->is_virtual(), "result must be virtual register");
set_result(x, result);
-#else
- // flags that vary for the different operations and different SSE-settings
- bool fixed_input = false, fixed_result = false, round_result = false, needs_stub = false;
-
- switch (x->op()) {
- case Bytecodes::_i2l: // fall through
- case Bytecodes::_l2i: // fall through
- case Bytecodes::_i2b: // fall through
- case Bytecodes::_i2c: // fall through
- case Bytecodes::_i2s: fixed_input = false; fixed_result = false; round_result = false; needs_stub = false; break;
-
- case Bytecodes::_f2d: fixed_input = UseSSE == 1; fixed_result = false; round_result = false; needs_stub = false; break;
- case Bytecodes::_d2f: fixed_input = false; fixed_result = UseSSE == 1; round_result = UseSSE < 1; needs_stub = false; break;
- case Bytecodes::_i2f: fixed_input = false; fixed_result = false; round_result = UseSSE < 1; needs_stub = false; break;
- case Bytecodes::_i2d: fixed_input = false; fixed_result = false; round_result = false; needs_stub = false; break;
- case Bytecodes::_f2i: fixed_input = false; fixed_result = false; round_result = false; needs_stub = true; break;
- case Bytecodes::_d2i: fixed_input = false; fixed_result = false; round_result = false; needs_stub = true; break;
- case Bytecodes::_l2f: fixed_input = false; fixed_result = UseSSE >= 1; round_result = UseSSE < 1; needs_stub = false; break;
- case Bytecodes::_l2d: fixed_input = false; fixed_result = UseSSE >= 2; round_result = UseSSE < 2; needs_stub = false; break;
- case Bytecodes::_f2l: fixed_input = true; fixed_result = true; round_result = false; needs_stub = false; break;
- case Bytecodes::_d2l: fixed_input = true; fixed_result = true; round_result = false; needs_stub = false; break;
- default: ShouldNotReachHere();
- }
-
- LIRItem value(x->value(), this);
- value.load_item();
- LIR_Opr input = value.result();
- LIR_Opr result = rlock(x);
-
- // arguments of lir_convert
- LIR_Opr conv_input = input;
- LIR_Opr conv_result = result;
- ConversionStub* stub = nullptr;
-
- if (fixed_input) {
- conv_input = fixed_register_for(input->type());
- __ move(input, conv_input);
- }
-
- assert(fixed_result == false || round_result == false, "cannot set both");
- if (fixed_result) {
- conv_result = fixed_register_for(result->type());
- } else if (round_result) {
- result = new_register(result->type());
- set_vreg_flag(result, must_start_in_memory);
- }
-
- if (needs_stub) {
- stub = new ConversionStub(x->op(), conv_input, conv_result);
- }
-
- __ convert(x->op(), conv_input, conv_result, stub);
-
- if (result != conv_result) {
- __ move(conv_result, result);
- }
-
- assert(result->is_virtual(), "result must be virtual register");
- set_result(x, result);
-#endif // _LP64
}
@@ -1547,13 +1375,7 @@ void LIRGenerator::do_If(If* x) {
LIR_Opr LIRGenerator::getThreadPointer() {
-#ifdef _LP64
return FrameMap::as_pointer_opr(r15_thread);
-#else
- LIR_Opr result = new_register(T_INT);
- __ get_thread(result);
- return result;
-#endif //
}
void LIRGenerator::trace_block_entry(BlockBegin* block) {
@@ -1598,12 +1420,6 @@ void LIRGenerator::volatile_field_load(LIR_Address* address, LIR_Opr result,
LIR_Opr temp_double = new_register(T_DOUBLE);
__ volatile_move(LIR_OprFact::address(address), temp_double, T_LONG, info);
__ volatile_move(temp_double, result, T_LONG);
-#ifndef _LP64
- if (UseSSE < 2) {
- // no spill slot needed in SSE2 mode because xmm->cpu register move is possible
- set_vreg_flag(result, must_start_in_memory);
- }
-#endif // !LP64
} else {
__ load(address, result, info);
}
diff --git a/src/hotspot/cpu/x86/c1_LIR_x86.cpp b/src/hotspot/cpu/x86/c1_LIR_x86.cpp
index adcc53c44ce14..ce831c5f95649 100644
--- a/src/hotspot/cpu/x86/c1_LIR_x86.cpp
+++ b/src/hotspot/cpu/x86/c1_LIR_x86.cpp
@@ -58,16 +58,9 @@ LIR_Opr LIR_OprFact::double_fpu(int reg1, int reg2) {
#ifndef PRODUCT
void LIR_Address::verify() const {
-#ifdef _LP64
assert(base()->is_cpu_register(), "wrong base operand");
assert(index()->is_illegal() || index()->is_double_cpu(), "wrong index operand");
assert(base()->type() == T_ADDRESS || base()->type() == T_OBJECT || base()->type() == T_LONG || base()->type() == T_METADATA,
"wrong type for addresses");
-#else
- assert(base()->is_single_cpu(), "wrong base operand");
- assert(index()->is_illegal() || index()->is_single_cpu(), "wrong index operand");
- assert(base()->type() == T_ADDRESS || base()->type() == T_OBJECT || base()->type() == T_INT || base()->type() == T_METADATA,
- "wrong type for addresses");
-#endif
}
#endif // PRODUCT
diff --git a/src/hotspot/cpu/x86/c1_LinearScan_x86.hpp b/src/hotspot/cpu/x86/c1_LinearScan_x86.hpp
index 8669c9ab8a10b..dcc9a77765a8f 100644
--- a/src/hotspot/cpu/x86/c1_LinearScan_x86.hpp
+++ b/src/hotspot/cpu/x86/c1_LinearScan_x86.hpp
@@ -26,12 +26,6 @@
#define CPU_X86_C1_LINEARSCAN_X86_HPP
inline bool LinearScan::is_processed_reg_num(int reg_num) {
-#ifndef _LP64
- // rsp and rbp (numbers 6 ancd 7) are ignored
- assert(FrameMap::rsp_opr->cpu_regnr() == 6, "wrong assumption below");
- assert(FrameMap::rbp_opr->cpu_regnr() == 7, "wrong assumption below");
- assert(reg_num >= 0, "invalid reg_num");
-#else
// rsp and rbp, r10, r15 (numbers [12,15]) are ignored
// r12 (number 11) is conditional on compressed oops.
assert(FrameMap::r12_opr->cpu_regnr() == 11, "wrong assumption below");
@@ -40,16 +34,10 @@ inline bool LinearScan::is_processed_reg_num(int reg_num) {
assert(FrameMap::rsp_opr->cpu_regnrLo() == 14, "wrong assumption below");
assert(FrameMap::rbp_opr->cpu_regnrLo() == 15, "wrong assumption below");
assert(reg_num >= 0, "invalid reg_num");
-#endif // _LP64
return reg_num <= FrameMap::last_cpu_reg() || reg_num >= pd_nof_cpu_regs_frame_map;
}
inline int LinearScan::num_physical_regs(BasicType type) {
- // Intel requires two cpu registers for long,
- // but requires only one fpu register for double
- if (LP64_ONLY(false &&) type == T_LONG) {
- return 2;
- }
return 1;
}
@@ -79,7 +67,7 @@ inline bool LinearScanWalker::pd_init_regs_for_alloc(Interval* cur) {
_first_reg = pd_first_byte_reg;
_last_reg = FrameMap::last_byte_reg();
return true;
- } else if ((UseSSE >= 1 && cur->type() == T_FLOAT) || (UseSSE >= 2 && cur->type() == T_DOUBLE)) {
+ } else if (cur->type() == T_FLOAT || cur->type() == T_DOUBLE) {
_first_reg = pd_first_xmm_reg;
_last_reg = last_xmm_reg;
return true;
diff --git a/src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp b/src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp
index 238a1bd048a9f..684347e35fa40 100644
--- a/src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp
+++ b/src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp
@@ -55,24 +55,17 @@ int C1_MacroAssembler::lock_object(Register hdr, Register obj, Register disp_hdr
null_check_offset = offset();
- if (DiagnoseSyncOnValueBasedClasses != 0) {
- load_klass(hdr, obj, rscratch1);
- testb(Address(hdr, Klass::misc_flags_offset()), KlassFlags::_misc_is_value_based_class);
- jcc(Assembler::notZero, slow_case);
- }
-
if (LockingMode == LM_LIGHTWEIGHT) {
-#ifdef _LP64
- const Register thread = r15_thread;
- lightweight_lock(disp_hdr, obj, hdr, thread, tmp, slow_case);
-#else
- // Implicit null check.
- movptr(hdr, Address(obj, oopDesc::mark_offset_in_bytes()));
- // Lacking registers and thread on x86_32. Always take slow path.
- jmp(slow_case);
-#endif
+ lightweight_lock(disp_hdr, obj, hdr, tmp, slow_case);
} else if (LockingMode == LM_LEGACY) {
Label done;
+
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(hdr, obj, rscratch1);
+ testb(Address(hdr, Klass::misc_flags_offset()), KlassFlags::_misc_is_value_based_class);
+ jcc(Assembler::notZero, slow_case);
+ }
+
// Load object header
movptr(hdr, Address(obj, hdr_offset));
// and mark it as unlocked
@@ -135,12 +128,7 @@ void C1_MacroAssembler::unlock_object(Register hdr, Register obj, Register disp_
verify_oop(obj);
if (LockingMode == LM_LIGHTWEIGHT) {
-#ifdef _LP64
- lightweight_unlock(obj, disp_hdr, r15_thread, hdr, slow_case);
-#else
- // Lacking registers and thread on x86_32. Always take slow path.
- jmp(slow_case);
-#endif
+ lightweight_unlock(obj, disp_hdr, hdr, slow_case);
} else if (LockingMode == LM_LEGACY) {
// test if object header is pointing to the displaced header, and if so, restore
// the displaced header in the object - if the object header is not pointing to
@@ -160,7 +148,7 @@ void C1_MacroAssembler::unlock_object(Register hdr, Register obj, Register disp_
// Defines obj, preserves var_size_in_bytes
void C1_MacroAssembler::try_allocate(Register obj, Register var_size_in_bytes, int con_size_in_bytes, Register t1, Register t2, Label& slow_case) {
if (UseTLAB) {
- tlab_allocate(noreg, obj, var_size_in_bytes, con_size_in_bytes, t1, t2, slow_case);
+ tlab_allocate(obj, var_size_in_bytes, con_size_in_bytes, t1, t2, slow_case);
} else {
jmp(slow_case);
}
@@ -169,7 +157,6 @@ void C1_MacroAssembler::try_allocate(Register obj, Register var_size_in_bytes, i
void C1_MacroAssembler::initialize_header(Register obj, Register klass, Register len, Register t1, Register t2) {
assert_different_registers(obj, klass, len, t1, t2);
-#ifdef _LP64
if (UseCompactObjectHeaders) {
movptr(t1, Address(klass, Klass::prototype_header_offset()));
movptr(Address(obj, oopDesc::mark_offset_in_bytes()), t1);
@@ -178,16 +165,13 @@ void C1_MacroAssembler::initialize_header(Register obj, Register klass, Register
movptr(t1, klass);
encode_klass_not_null(t1, rscratch1);
movl(Address(obj, oopDesc::klass_offset_in_bytes()), t1);
- } else
-#endif
- {
+ } else {
movptr(Address(obj, oopDesc::mark_offset_in_bytes()), checked_cast<int32_t>(markWord::prototype().value()));
movptr(Address(obj, oopDesc::klass_offset_in_bytes()), klass);
}
if (len->is_valid()) {
movl(Address(obj, arrayOopDesc::length_offset_in_bytes()), len);
-#ifdef _LP64
int base_offset = arrayOopDesc::length_offset_in_bytes() + BytesPerInt;
if (!is_aligned(base_offset, BytesPerWord)) {
assert(is_aligned(base_offset, BytesPerInt), "must be 4-byte aligned");
@@ -195,14 +179,10 @@ void C1_MacroAssembler::initialize_header(Register obj, Register klass, Register
xorl(t1, t1);
movl(Address(obj, base_offset), t1);
}
-#endif
- }
-#ifdef _LP64
- else if (UseCompressedClassPointers && !UseCompactObjectHeaders) {
+ } else if (UseCompressedClassPointers && !UseCompactObjectHeaders) {
xorptr(t1, t1);
store_klass_gap(obj, t1);
}
-#endif
}
@@ -265,8 +245,6 @@ void C1_MacroAssembler::initialize_object(Register obj, Register klass, Register
bind(loop);
movptr(Address(obj, index, Address::times_8, hdr_size_in_bytes - (1*BytesPerWord)),
t1_zero);
- NOT_LP64(movptr(Address(obj, index, Address::times_8, hdr_size_in_bytes - (2*BytesPerWord)),
- t1_zero);)
decrement(index);
jcc(Assembler::notZero, loop);
}
@@ -347,11 +325,11 @@ void C1_MacroAssembler::remove_frame(int frame_size_in_bytes) {
void C1_MacroAssembler::verified_entry(bool breakAtEntry) {
- if (breakAtEntry || VerifyFPU) {
+ if (breakAtEntry) {
// Verified Entry first instruction should be 5 bytes long for correct
// patching by patch_verified_entry().
//
- // Breakpoint and VerifyFPU have one byte first instruction.
+ // Breakpoint has one byte first instruction.
// Also first instruction will be one byte "push(rbp)" if stack banging
// code is not generated (see build_frame() above).
// For all these cases generate long instruction first.
@@ -359,7 +337,6 @@ void C1_MacroAssembler::verified_entry(bool breakAtEntry) {
}
if (breakAtEntry) int3();
// build frame
- IA32_ONLY( verify_FPU(0, "method_entry"); )
}
void C1_MacroAssembler::load_parameter(int offset_in_words, Register reg) {
diff --git a/src/hotspot/cpu/x86/c1_Runtime1_x86.cpp b/src/hotspot/cpu/x86/c1_Runtime1_x86.cpp
index cb4cb3af8c3d4..726574e69e8e6 100644
--- a/src/hotspot/cpu/x86/c1_Runtime1_x86.cpp
+++ b/src/hotspot/cpu/x86/c1_Runtime1_x86.cpp
@@ -51,35 +51,26 @@
int StubAssembler::call_RT(Register oop_result1, Register metadata_result, address entry, int args_size) {
// setup registers
- const Register thread = NOT_LP64(rdi) LP64_ONLY(r15_thread); // is callee-saved register (Visual C++ calling conventions)
+ const Register thread = r15_thread;
assert(!(oop_result1->is_valid() || metadata_result->is_valid()) || oop_result1 != metadata_result, "registers must be different");
assert(oop_result1 != thread && metadata_result != thread, "registers must be different");
assert(args_size >= 0, "illegal args_size");
bool align_stack = false;
-#ifdef _LP64
+
// At a method handle call, the stack may not be properly aligned
// when returning with an exception.
align_stack = (stub_id() == (int)C1StubId::handle_exception_from_callee_id);
-#endif
-#ifdef _LP64
mov(c_rarg0, thread);
set_num_rt_args(0); // Nothing on stack
-#else
- set_num_rt_args(1 + args_size);
-
- // push java thread (becomes first argument of C function)
- get_thread(thread);
- push(thread);
-#endif // _LP64
int call_offset = -1;
if (!align_stack) {
- set_last_Java_frame(thread, noreg, rbp, nullptr, rscratch1);
+ set_last_Java_frame(noreg, rbp, nullptr, rscratch1);
} else {
address the_pc = pc();
call_offset = offset();
- set_last_Java_frame(thread, noreg, rbp, the_pc, rscratch1);
+ set_last_Java_frame(noreg, rbp, the_pc, rscratch1);
andptr(rsp, -(StackAlignmentInBytes)); // Align stack
}
@@ -93,7 +84,7 @@ int StubAssembler::call_RT(Register oop_result1, Register metadata_result, addre
guarantee(thread != rax, "change this code");
push(rax);
{ Label L;
- get_thread(rax);
+ get_thread_slow(rax);
cmpptr(thread, rax);
jcc(Assembler::equal, L);
int3();
@@ -102,10 +93,7 @@ int StubAssembler::call_RT(Register oop_result1, Register metadata_result, addre
}
pop(rax);
#endif
- reset_last_Java_frame(thread, true);
-
- // discard thread and arguments
- NOT_LP64(addptr(rsp, num_rt_args()*BytesPerWord));
+ reset_last_Java_frame(true);
// check for pending exceptions
{ Label L;
@@ -115,10 +103,10 @@ int StubAssembler::call_RT(Register oop_result1, Register metadata_result, addre
movptr(rax, Address(thread, Thread::pending_exception_offset()));
// make sure that the vm_results are cleared
if (oop_result1->is_valid()) {
- movptr(Address(thread, JavaThread::vm_result_offset()), NULL_WORD);
+ movptr(Address(thread, JavaThread::vm_result_oop_offset()), NULL_WORD);
}
if (metadata_result->is_valid()) {
- movptr(Address(thread, JavaThread::vm_result_2_offset()), NULL_WORD);
+ movptr(Address(thread, JavaThread::vm_result_metadata_offset()), NULL_WORD);
}
if (frame_size() == no_frame_size) {
leave();
@@ -132,10 +120,10 @@ int StubAssembler::call_RT(Register oop_result1, Register metadata_result, addre
}
// get oop results if there are any and reset the values in the thread
if (oop_result1->is_valid()) {
- get_vm_result(oop_result1, thread);
+ get_vm_result_oop(oop_result1);
}
if (metadata_result->is_valid()) {
- get_vm_result_2(metadata_result, thread);
+ get_vm_result_metadata(metadata_result);
}
assert(call_offset >= 0, "Should be set");
@@ -144,17 +132,12 @@ int StubAssembler::call_RT(Register oop_result1, Register metadata_result, addre
int StubAssembler::call_RT(Register oop_result1, Register metadata_result, address entry, Register arg1) {
-#ifdef _LP64
mov(c_rarg1, arg1);
-#else
- push(arg1);
-#endif // _LP64
return call_RT(oop_result1, metadata_result, entry, 1);
}
int StubAssembler::call_RT(Register oop_result1, Register metadata_result, address entry, Register arg1, Register arg2) {
-#ifdef _LP64
if (c_rarg1 == arg2) {
if (c_rarg2 == arg1) {
xchgq(arg1, arg2);
@@ -166,16 +149,11 @@ int StubAssembler::call_RT(Register oop_result1, Register metadata_result, addre
mov(c_rarg1, arg1);
mov(c_rarg2, arg2);
}
-#else
- push(arg2);
- push(arg1);
-#endif // _LP64
return call_RT(oop_result1, metadata_result, entry, 2);
}
int StubAssembler::call_RT(Register oop_result1, Register metadata_result, address entry, Register arg1, Register arg2, Register arg3) {
-#ifdef _LP64
// if there is any conflict use the stack
if (arg1 == c_rarg2 || arg1 == c_rarg3 ||
arg2 == c_rarg1 || arg2 == c_rarg3 ||
@@ -191,11 +169,6 @@ int StubAssembler::call_RT(Register oop_result1, Register metadata_result, addre
mov(c_rarg2, arg2);
mov(c_rarg3, arg3);
}
-#else
- push(arg3);
- push(arg2);
- push(arg1);
-#endif // _LP64
return call_RT(oop_result1, metadata_result, entry, 3);
}
@@ -262,20 +235,13 @@ const int xmm_regs_as_doubles_size_in_slots = FrameMap::nof_xmm_regs * 2;
// but the code in save_live_registers will take the argument count into
// account.
//
-#ifdef _LP64
- #define SLOT2(x) x,
- #define SLOT_PER_WORD 2
-#else
- #define SLOT2(x)
- #define SLOT_PER_WORD 1
-#endif // _LP64
+#define SLOT2(x) x,
+#define SLOT_PER_WORD 2
enum reg_save_layout {
// 64bit needs to keep stack 16 byte aligned. So we add some alignment dummies to make that
// happen and will assert if the stack size we create is misaligned
-#ifdef _LP64
align_dummy_0, align_dummy_1,
-#endif // _LP64
#ifdef _WIN64
// Windows always allocates space for it's argument registers (see
// frame::arg_reg_save_area_bytes).
@@ -291,7 +257,6 @@ enum reg_save_layout {
fpu_state_end_off = fpu_state_off + (FPUStateSizeInWords / SLOT_PER_WORD), // 352
marker = fpu_state_end_off, SLOT2(markerH) // 352, 356
extra_space_offset, // 360
-#ifdef _LP64
r15_off = extra_space_offset, r15H_off, // 360, 364
r14_off, r14H_off, // 368, 372
r13_off, r13H_off, // 376, 380
@@ -301,9 +266,6 @@ enum reg_save_layout {
r9_off, r9H_off, // 408, 412
r8_off, r8H_off, // 416, 420
rdi_off, rdiH_off, // 424, 428
-#else
- rdi_off = extra_space_offset,
-#endif // _LP64
rsi_off, SLOT2(rsiH_off) // 432, 436
rbp_off, SLOT2(rbpH_off) // 440, 444
rsp_off, SLOT2(rspH_off) // 448, 452
@@ -329,8 +291,8 @@ static OopMap* generate_oop_map(StubAssembler* sasm, int num_rt_args,
bool save_fpu_registers = true) {
// In 64bit all the args are in regs so there are no additional stack slots
- LP64_ONLY(num_rt_args = 0);
- LP64_ONLY(assert((reg_save_frame_size * VMRegImpl::stack_slot_size) % 16 == 0, "must be 16 byte aligned");)
+ num_rt_args = 0;
+ assert((reg_save_frame_size * VMRegImpl::stack_slot_size) % 16 == 0, "must be 16 byte aligned");
int frame_size_in_slots = reg_save_frame_size + num_rt_args; // args + thread
sasm->set_frame_size(frame_size_in_slots / VMRegImpl::slots_per_word);
@@ -343,7 +305,6 @@ static OopMap* generate_oop_map(StubAssembler* sasm, int num_rt_args,
map->set_callee_saved(VMRegImpl::stack2reg(rbx_off + num_rt_args), rbx->as_VMReg());
map->set_callee_saved(VMRegImpl::stack2reg(rsi_off + num_rt_args), rsi->as_VMReg());
map->set_callee_saved(VMRegImpl::stack2reg(rdi_off + num_rt_args), rdi->as_VMReg());
-#ifdef _LP64
map->set_callee_saved(VMRegImpl::stack2reg(r8_off + num_rt_args), r8->as_VMReg());
map->set_callee_saved(VMRegImpl::stack2reg(r9_off + num_rt_args), r9->as_VMReg());
map->set_callee_saved(VMRegImpl::stack2reg(r10_off + num_rt_args), r10->as_VMReg());
@@ -369,52 +330,23 @@ static OopMap* generate_oop_map(StubAssembler* sasm, int num_rt_args,
map->set_callee_saved(VMRegImpl::stack2reg(r13H_off + num_rt_args), r13->as_VMReg()->next());
map->set_callee_saved(VMRegImpl::stack2reg(r14H_off + num_rt_args), r14->as_VMReg()->next());
map->set_callee_saved(VMRegImpl::stack2reg(r15H_off + num_rt_args), r15->as_VMReg()->next());
-#endif // _LP64
int xmm_bypass_limit = FrameMap::get_num_caller_save_xmms();
if (save_fpu_registers) {
-#ifndef _LP64
- if (UseSSE < 2) {
- int fpu_off = float_regs_as_doubles_off;
- for (int n = 0; n < FrameMap::nof_fpu_regs; n++) {
- VMReg fpu_name_0 = FrameMap::fpu_regname(n);
- map->set_callee_saved(VMRegImpl::stack2reg(fpu_off + num_rt_args), fpu_name_0);
+ int xmm_off = xmm_regs_as_doubles_off;
+ for (int n = 0; n < FrameMap::nof_xmm_regs; n++) {
+ if (n < xmm_bypass_limit) {
+ VMReg xmm_name_0 = as_XMMRegister(n)->as_VMReg();
+ map->set_callee_saved(VMRegImpl::stack2reg(xmm_off + num_rt_args), xmm_name_0);
// %%% This is really a waste but we'll keep things as they were for now
if (true) {
- map->set_callee_saved(VMRegImpl::stack2reg(fpu_off + 1 + num_rt_args), fpu_name_0->next());
- }
- fpu_off += 2;
- }
- assert(fpu_off == fpu_state_off, "incorrect number of fpu stack slots");
-
- if (UseSSE == 1) {
- int xmm_off = xmm_regs_as_doubles_off;
- for (int n = 0; n < FrameMap::nof_fpu_regs; n++) {
- VMReg xmm_name_0 = as_XMMRegister(n)->as_VMReg();
- map->set_callee_saved(VMRegImpl::stack2reg(xmm_off + num_rt_args), xmm_name_0);
- xmm_off += 2;
+ map->set_callee_saved(VMRegImpl::stack2reg(xmm_off + 1 + num_rt_args), xmm_name_0->next());
}
- assert(xmm_off == float_regs_as_doubles_off, "incorrect number of xmm registers");
}
+ xmm_off += 2;
}
-#endif // !LP64
-
- if (UseSSE >= 2) {
- int xmm_off = xmm_regs_as_doubles_off;
- for (int n = 0; n < FrameMap::nof_xmm_regs; n++) {
- if (n < xmm_bypass_limit) {
- VMReg xmm_name_0 = as_XMMRegister(n)->as_VMReg();
- map->set_callee_saved(VMRegImpl::stack2reg(xmm_off + num_rt_args), xmm_name_0);
- // %%% This is really a waste but we'll keep things as they were for now
- if (true) {
- map->set_callee_saved(VMRegImpl::stack2reg(xmm_off + 1 + num_rt_args), xmm_name_0->next());
- }
- }
- xmm_off += 2;
- }
- assert(xmm_off == float_regs_as_doubles_off, "incorrect number of xmm registers");
- }
+ assert(xmm_off == float_regs_as_doubles_off, "incorrect number of xmm registers");
}
return map;
@@ -426,14 +358,7 @@ void C1_MacroAssembler::save_live_registers_no_oop_map(bool save_fpu_registers)
__ block_comment("save_live_registers");
// Push CPU state in multiple of 16 bytes
-#ifdef _LP64
__ save_legacy_gprs();
-#else
- __ pusha();
-#endif
-
- // assert(float_regs_as_doubles_off % 2 == 0, "misaligned offset");
- // assert(xmm_regs_as_doubles_off % 2 == 0, "misaligned offset");
__ subptr(rsp, extra_space_offset * VMRegImpl::stack_slot_size);
@@ -442,71 +367,25 @@ void C1_MacroAssembler::save_live_registers_no_oop_map(bool save_fpu_registers)
#endif
if (save_fpu_registers) {
-#ifndef _LP64
- if (UseSSE < 2) {
- // save FPU stack
- __ fnsave(Address(rsp, fpu_state_off * VMRegImpl::stack_slot_size));
- __ fwait();
-
-#ifdef ASSERT
- Label ok;
- __ cmpw(Address(rsp, fpu_state_off * VMRegImpl::stack_slot_size), StubRoutines::x86::fpu_cntrl_wrd_std());
- __ jccb(Assembler::equal, ok);
- __ stop("corrupted control word detected");
- __ bind(ok);
-#endif
-
- // Reset the control word to guard against exceptions being unmasked
- // since fstp_d can cause FPU stack underflow exceptions. Write it
- // into the on stack copy and then reload that to make sure that the
- // current and future values are correct.
- __ movw(Address(rsp, fpu_state_off * VMRegImpl::stack_slot_size), StubRoutines::x86::fpu_cntrl_wrd_std());
- __ frstor(Address(rsp, fpu_state_off * VMRegImpl::stack_slot_size));
-
- // Save the FPU registers in de-opt-able form
- int offset = 0;
- for (int n = 0; n < FrameMap::nof_fpu_regs; n++) {
- __ fstp_d(Address(rsp, float_regs_as_doubles_off * VMRegImpl::stack_slot_size + offset));
- offset += 8;
- }
-
- if (UseSSE == 1) {
- // save XMM registers as float because double not supported without SSE2(num MMX == num fpu)
- int offset = 0;
- for (int n = 0; n < FrameMap::nof_fpu_regs; n++) {
- XMMRegister xmm_name = as_XMMRegister(n);
- __ movflt(Address(rsp, xmm_regs_as_doubles_off * VMRegImpl::stack_slot_size + offset), xmm_name);
- offset += 8;
- }
- }
- }
-#endif // !_LP64
-
- if (UseSSE >= 2) {
- // save XMM registers
- // XMM registers can contain float or double values, but this is not known here,
- // so always save them as doubles.
- // note that float values are _not_ converted automatically, so for float values
- // the second word contains only garbage data.
- int xmm_bypass_limit = FrameMap::get_num_caller_save_xmms();
- int offset = 0;
- for (int n = 0; n < xmm_bypass_limit; n++) {
- XMMRegister xmm_name = as_XMMRegister(n);
- __ movdbl(Address(rsp, xmm_regs_as_doubles_off * VMRegImpl::stack_slot_size + offset), xmm_name);
- offset += 8;
- }
+ // save XMM registers
+ // XMM registers can contain float or double values, but this is not known here,
+ // so always save them as doubles.
+ // note that float values are _not_ converted automatically, so for float values
+ // the second word contains only garbage data.
+ int xmm_bypass_limit = FrameMap::get_num_caller_save_xmms();
+ int offset = 0;
+ for (int n = 0; n < xmm_bypass_limit; n++) {
+ XMMRegister xmm_name = as_XMMRegister(n);
+ __ movdbl(Address(rsp, xmm_regs_as_doubles_off * VMRegImpl::stack_slot_size + offset), xmm_name);
+ offset += 8;
}
}
-
- // FPU stack must be empty now
- NOT_LP64( __ verify_FPU(0, "save_live_registers"); )
}
#undef __
#define __ sasm->
static void restore_fpu(C1_MacroAssembler* sasm, bool restore_fpu_registers) {
-#ifdef _LP64
if (restore_fpu_registers) {
// restore XMM registers
int xmm_bypass_limit = FrameMap::get_num_caller_save_xmms();
@@ -517,38 +396,6 @@ static void restore_fpu(C1_MacroAssembler* sasm, bool restore_fpu_registers) {
offset += 8;
}
}
-#else
- if (restore_fpu_registers) {
- if (UseSSE >= 2) {
- // restore XMM registers
- int xmm_bypass_limit = FrameMap::nof_xmm_regs;
- int offset = 0;
- for (int n = 0; n < xmm_bypass_limit; n++) {
- XMMRegister xmm_name = as_XMMRegister(n);
- __ movdbl(xmm_name, Address(rsp, xmm_regs_as_doubles_off * VMRegImpl::stack_slot_size + offset));
- offset += 8;
- }
- } else if (UseSSE == 1) {
- // restore XMM registers(num MMX == num fpu)
- int offset = 0;
- for (int n = 0; n < FrameMap::nof_fpu_regs; n++) {
- XMMRegister xmm_name = as_XMMRegister(n);
- __ movflt(xmm_name, Address(rsp, xmm_regs_as_doubles_off * VMRegImpl::stack_slot_size + offset));
- offset += 8;
- }
- }
-
- if (UseSSE < 2) {
- __ frstor(Address(rsp, fpu_state_off * VMRegImpl::stack_slot_size));
- } else {
- // check that FPU stack is really empty
- __ verify_FPU(0, "restore_live_registers");
- }
- } else {
- // check that FPU stack is really empty
- __ verify_FPU(0, "restore_live_registers");
- }
-#endif // _LP64
#ifdef ASSERT
{
@@ -570,12 +417,7 @@ void C1_MacroAssembler::restore_live_registers(bool restore_fpu_registers) {
__ block_comment("restore_live_registers");
restore_fpu(this, restore_fpu_registers);
-#ifdef _LP64
__ restore_legacy_gprs();
-#else
- __ popa();
-#endif
-
}
@@ -584,7 +426,6 @@ void C1_MacroAssembler::restore_live_registers_except_rax(bool restore_fpu_regis
restore_fpu(this, restore_fpu_registers);
-#ifdef _LP64
__ movptr(r15, Address(rsp, 0));
__ movptr(r14, Address(rsp, wordSize));
__ movptr(r13, Address(rsp, 2 * wordSize));
@@ -602,17 +443,6 @@ void C1_MacroAssembler::restore_live_registers_except_rax(bool restore_fpu_regis
__ movptr(rcx, Address(rsp, 14 * wordSize));
__ addptr(rsp, 16 * wordSize);
-#else
-
- __ pop(rdi);
- __ pop(rsi);
- __ pop(rbp);
- __ pop(rbx); // skip this value
- __ pop(rbx);
- __ pop(rdx);
- __ pop(rcx);
- __ addptr(rsp, BytesPerWord);
-#endif // _LP64
}
#undef __
@@ -639,12 +469,7 @@ void Runtime1::initialize_pd() {
// return: offset in 64-bit words.
uint Runtime1::runtime_blob_current_thread_offset(frame f) {
-#ifdef _LP64
return r15_off / 2; // rsp offsets are in halfwords
-#else
- Unimplemented();
- return 0;
-#endif
}
// Target: the entry point of the method that creates and posts the exception oop.
@@ -664,15 +489,8 @@ OopMapSet* Runtime1::generate_exception_throw(StubAssembler* sasm, address targe
// Load arguments for exception that are passed as arguments into the stub.
if (has_argument) {
-#ifdef _LP64
__ movptr(c_rarg1, Address(rbp, 2*BytesPerWord));
__ movptr(c_rarg2, Address(rbp, 3*BytesPerWord));
-#else
- __ movptr(temp_reg, Address(rbp, 3*BytesPerWord));
- __ push(temp_reg);
- __ movptr(temp_reg, Address(rbp, 2*BytesPerWord));
- __ push(temp_reg);
-#endif // _LP64
}
int call_offset = __ call_RT(noreg, noreg, target, num_rt_args - 1);
@@ -692,7 +510,7 @@ OopMapSet* Runtime1::generate_handle_exception(C1StubId id, StubAssembler *sasm)
const Register exception_oop = rax;
const Register exception_pc = rdx;
// other registers used in this stub
- const Register thread = NOT_LP64(rdi) LP64_ONLY(r15_thread);
+ const Register thread = r15_thread;
// Save registers, if required.
OopMapSet* oop_maps = new OopMapSet();
@@ -714,8 +532,8 @@ OopMapSet* Runtime1::generate_handle_exception(C1StubId id, StubAssembler *sasm)
__ movptr(exception_pc, Address(rbp, 1*BytesPerWord));
// make sure that the vm_results are cleared (may be unnecessary)
- __ movptr(Address(thread, JavaThread::vm_result_offset()), NULL_WORD);
- __ movptr(Address(thread, JavaThread::vm_result_2_offset()), NULL_WORD);
+ __ movptr(Address(thread, JavaThread::vm_result_oop_offset()), NULL_WORD);
+ __ movptr(Address(thread, JavaThread::vm_result_metadata_offset()), NULL_WORD);
break;
case C1StubId::handle_exception_nofpu_id:
case C1StubId::handle_exception_id:
@@ -725,7 +543,7 @@ OopMapSet* Runtime1::generate_handle_exception(C1StubId id, StubAssembler *sasm)
case C1StubId::handle_exception_from_callee_id: {
// At this point all registers except exception oop (RAX) and
// exception pc (RDX) are dead.
- const int frame_size = 2 /*BP, return address*/ NOT_LP64(+ 1 /*thread*/) WIN64_ONLY(+ frame::arg_reg_save_area_bytes / BytesPerWord);
+ const int frame_size = 2 /*BP, return address*/ WIN64_ONLY(+ frame::arg_reg_save_area_bytes / BytesPerWord);
oop_map = new OopMap(frame_size * VMRegImpl::slots_per_word, 0);
sasm->set_frame_size(frame_size);
WIN64_ONLY(__ subq(rsp, frame::arg_reg_save_area_bytes));
@@ -734,21 +552,11 @@ OopMapSet* Runtime1::generate_handle_exception(C1StubId id, StubAssembler *sasm)
default: ShouldNotReachHere();
}
-#if !defined(_LP64) && defined(COMPILER2)
- if (UseSSE < 2 && !CompilerConfig::is_c1_only_no_jvmci()) {
- // C2 can leave the fpu stack dirty
- __ empty_FPU_stack();
- }
-#endif // !_LP64 && COMPILER2
-
// verify that only rax, and rdx is valid at this time
__ invalidate_registers(false, true, true, false, true, true);
// verify that rax, contains a valid exception
__ verify_not_null_oop(exception_oop);
- // load address of JavaThread object for thread-local data
- NOT_LP64(__ get_thread(thread);)
-
#ifdef ASSERT
// check that fields in JavaThread for exception oop and issuing pc are
// empty before writing to them
@@ -815,11 +623,11 @@ void Runtime1::generate_unwind_exception(StubAssembler *sasm) {
// incoming parameters
const Register exception_oop = rax;
// callee-saved copy of exception_oop during runtime call
- const Register exception_oop_callee_saved = NOT_LP64(rsi) LP64_ONLY(r14);
+ const Register exception_oop_callee_saved = r14;
// other registers used in this stub
const Register exception_pc = rdx;
const Register handler_addr = rbx;
- const Register thread = NOT_LP64(rdi) LP64_ONLY(r15_thread);
+ const Register thread = r15_thread;
if (AbortVMOnException) {
__ enter();
@@ -834,7 +642,6 @@ void Runtime1::generate_unwind_exception(StubAssembler *sasm) {
#ifdef ASSERT
// check that fields in JavaThread for exception oop and issuing pc are empty
- NOT_LP64(__ get_thread(thread);)
Label oop_empty;
__ cmpptr(Address(thread, JavaThread::exception_oop_offset()), 0);
__ jcc(Assembler::equal, oop_empty);
@@ -848,14 +655,10 @@ void Runtime1::generate_unwind_exception(StubAssembler *sasm) {
__ bind(pc_empty);
#endif
- // clear the FPU stack in case any FPU results are left behind
- NOT_LP64( __ empty_FPU_stack(); )
-
// save exception_oop in callee-saved register to preserve it during runtime calls
__ verify_not_null_oop(exception_oop);
__ movptr(exception_oop_callee_saved, exception_oop);
- NOT_LP64(__ get_thread(thread);)
// Get return address (is on top of stack after leave).
__ movptr(exception_pc, Address(rsp, 0));
@@ -905,19 +708,10 @@ OopMapSet* Runtime1::generate_patching(StubAssembler* sasm, address target) {
OopMap* oop_map = save_live_registers(sasm, num_rt_args);
-#ifdef _LP64
const Register thread = r15_thread;
// No need to worry about dummy
__ mov(c_rarg0, thread);
-#else
- __ push(rax); // push dummy
-
- const Register thread = rdi; // is callee-saved register (Visual C++ calling conventions)
- // push java thread (becomes first argument of C function)
- __ get_thread(thread);
- __ push(thread);
-#endif // _LP64
- __ set_last_Java_frame(thread, noreg, rbp, nullptr, rscratch1);
+ __ set_last_Java_frame(noreg, rbp, nullptr, rscratch1);
// do the call
__ call(RuntimeAddress(target));
OopMapSet* oop_maps = new OopMapSet();
@@ -927,7 +721,7 @@ OopMapSet* Runtime1::generate_patching(StubAssembler* sasm, address target) {
guarantee(thread != rax, "change this code");
__ push(rax);
{ Label L;
- __ get_thread(rax);
+ __ get_thread_slow(rax);
__ cmpptr(thread, rax);
__ jcc(Assembler::equal, L);
__ stop("StubAssembler::call_RT: rdi/r15 not callee saved?");
@@ -935,11 +729,7 @@ OopMapSet* Runtime1::generate_patching(StubAssembler* sasm, address target) {
}
__ pop(rax);
#endif
- __ reset_last_Java_frame(thread, true);
-#ifndef _LP64
- __ pop(rcx); // discard thread arg
- __ pop(rcx); // discard dummy
-#endif // _LP64
+ __ reset_last_Java_frame(true);
// check for pending exceptions
{ Label L;
@@ -1166,15 +956,8 @@ OopMapSet* Runtime1::generate_code_for(C1StubId id, StubAssembler* sasm) {
// This is called via call_runtime so the arguments
// will be place in C abi locations
-#ifdef _LP64
__ verify_oop(c_rarg0);
__ mov(rax, c_rarg0);
-#else
- // The object is passed on the stack and we haven't pushed a
- // frame yet so it's one work away from top of stack.
- __ movptr(rax, Address(rsp, 1 * BytesPerWord));
- __ verify_oop(rax);
-#endif // _LP64
// load the klass and check the has finalizer flag
Label register_finalizer;
@@ -1467,9 +1250,8 @@ OopMapSet* Runtime1::generate_code_for(C1StubId id, StubAssembler* sasm) {
// the live registers get saved.
save_live_registers(sasm, 1);
- __ NOT_LP64(push(rax)) LP64_ONLY(mov(c_rarg0, rax));
+ __ mov(c_rarg0, rax);
__ call(RuntimeAddress(CAST_FROM_FN_PTR(address, static_cast<int (*)(oopDesc*)>(SharedRuntime::dtrace_object_alloc))));
- NOT_LP64(__ pop(rax));
restore_live_registers(sasm);
}
@@ -1477,7 +1259,6 @@ OopMapSet* Runtime1::generate_code_for(C1StubId id, StubAssembler* sasm) {
case C1StubId::fpu2long_stub_id:
{
-#ifdef _LP64
Label done;
__ cvttsd2siq(rax, Address(rsp, wordSize));
__ cmp64(rax, ExternalAddress((address) StubRoutines::x86::double_sign_flip()));
@@ -1489,78 +1270,6 @@ OopMapSet* Runtime1::generate_code_for(C1StubId id, StubAssembler* sasm) {
__ pop(rax);
__ bind(done);
__ ret(0);
-#else
- // rax, and rdx are destroyed, but should be free since the result is returned there
- // preserve rsi,ecx
- __ push(rsi);
- __ push(rcx);
-
- // check for NaN
- Label return0, do_return, return_min_jlong, do_convert;
-
- Address value_high_word(rsp, wordSize + 4);
- Address value_low_word(rsp, wordSize);
- Address result_high_word(rsp, 3*wordSize + 4);
- Address result_low_word(rsp, 3*wordSize);
-
- __ subptr(rsp, 32); // more than enough on 32bit
- __ fst_d(value_low_word);
- __ movl(rax, value_high_word);
- __ andl(rax, 0x7ff00000);
- __ cmpl(rax, 0x7ff00000);
- __ jcc(Assembler::notEqual, do_convert);
- __ movl(rax, value_high_word);
- __ andl(rax, 0xfffff);
- __ orl(rax, value_low_word);
- __ jcc(Assembler::notZero, return0);
-
- __ bind(do_convert);
- __ fnstcw(Address(rsp, 0));
- __ movzwl(rax, Address(rsp, 0));
- __ orl(rax, 0xc00);
- __ movw(Address(rsp, 2), rax);
- __ fldcw(Address(rsp, 2));
- __ fwait();
- __ fistp_d(result_low_word);
- __ fldcw(Address(rsp, 0));
- __ fwait();
- // This gets the entire long in rax on 64bit
- __ movptr(rax, result_low_word);
- // testing of high bits
- __ movl(rdx, result_high_word);
- __ mov(rcx, rax);
- // What the heck is the point of the next instruction???
- __ xorl(rcx, 0x0);
- __ movl(rsi, 0x80000000);
- __ xorl(rsi, rdx);
- __ orl(rcx, rsi);
- __ jcc(Assembler::notEqual, do_return);
- __ fldz();
- __ fcomp_d(value_low_word);
- __ fnstsw_ax();
- __ sahf();
- __ jcc(Assembler::above, return_min_jlong);
- // return max_jlong
- __ movl(rdx, 0x7fffffff);
- __ movl(rax, 0xffffffff);
- __ jmp(do_return);
-
- __ bind(return_min_jlong);
- __ movl(rdx, 0x80000000);
- __ xorl(rax, rax);
- __ jmp(do_return);
-
- __ bind(return0);
- __ fpop();
- __ xorptr(rdx,rdx);
- __ xorptr(rax,rax);
-
- __ bind(do_return);
- __ addptr(rsp, 32);
- __ pop(rcx);
- __ pop(rsi);
- __ ret(0);
-#endif // _LP64
}
break;
diff --git a/src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp b/src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp
index 83ecdee52199b..b4f8e9d95147d 100644
--- a/src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp
+++ b/src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp
@@ -43,22 +43,8 @@ void C2SafepointPollStub::emit(C2_MacroAssembler& masm) {
__ bind(entry());
InternalAddress safepoint_pc(masm.pc() - masm.offset() + _safepoint_offset);
-#ifdef _LP64
__ lea(rscratch1, safepoint_pc);
__ movptr(Address(r15_thread, JavaThread::saved_exception_pc_offset()), rscratch1);
-#else
- const Register tmp1 = rcx;
- const Register tmp2 = rdx;
- __ push(tmp1);
- __ push(tmp2);
-
- __ lea(tmp1, safepoint_pc);
- __ get_thread(tmp2);
- __ movptr(Address(tmp2, JavaThread::saved_exception_pc_offset()), tmp1);
-
- __ pop(tmp2);
- __ pop(tmp1);
-#endif
__ jump(callback_addr);
}
diff --git a/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp b/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
index b6d513f50f288..177be6e59f74a 100644
--- a/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
+++ b/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp
@@ -107,16 +107,6 @@ void C2_MacroAssembler::verified_entry(int framesize, int stack_bang_size, bool
movptr(Address(rsp, framesize), (int32_t)0xbadb100d);
}
-#ifndef _LP64
- // If method sets FPU control word do it now
- if (fp_mode_24b) {
- fldcw(ExternalAddress(StubRoutines::x86::addr_fpu_cntrl_wrd_24()));
- }
- if (UseSSE >= 2 && VerifyFPU) {
- verify_FPU(0, "FPU stack must be clean on entry");
- }
-#endif
-
#ifdef ASSERT
if (VerifyStackAtCalls) {
Label L;
@@ -133,7 +123,6 @@ void C2_MacroAssembler::verified_entry(int framesize, int stack_bang_size, bool
if (!is_stub) {
BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler();
- #ifdef _LP64
// We put the non-hot code of the nmethod entry barrier out-of-line in a stub.
Label dummy_slow_path;
Label dummy_continuation;
@@ -147,10 +136,6 @@ void C2_MacroAssembler::verified_entry(int framesize, int stack_bang_size, bool
continuation = &stub->continuation();
}
bs->nmethod_entry_barrier(this, slow_path, continuation);
-#else
- // Don't bother with out-of-line nmethod entry barrier stub for x86_32.
- bs->nmethod_entry_barrier(this, nullptr /* slow_path */, nullptr /* continuation */);
-#endif
}
}
@@ -299,7 +284,7 @@ void C2_MacroAssembler::fast_lock(Register objReg, Register boxReg, Register tmp
// Locked by current thread if difference with current SP is less than one page.
subptr(tmpReg, rsp);
// Next instruction set ZFlag == 1 (Success) if difference is less then one page.
- andptr(tmpReg, (int32_t) (NOT_LP64(0xFFFFF003) LP64_ONLY(7 - (int)os::vm_page_size())) );
+ andptr(tmpReg, (int32_t) (7 - (int)os::vm_page_size()) );
movptr(Address(boxReg, 0), tmpReg);
}
jmp(DONE_LABEL);
@@ -307,10 +292,6 @@ void C2_MacroAssembler::fast_lock(Register objReg, Register boxReg, Register tmp
bind(IsInflated);
// The object is inflated. tmpReg contains pointer to ObjectMonitor* + markWord::monitor_value
-#ifndef _LP64
- // Just take slow path to avoid dealing with 64 bit atomic instructions here.
- orl(boxReg, 1); // set ICC.ZF=0 to indicate failure
-#else
// Unconditionally set box->_displaced_header = markWord::unused_mark().
// Without cast to int32_t this style of movptr will destroy r10 which is typically obj.
movptr(Address(boxReg, 0), checked_cast<int32_t>(markWord::unused_mark().value()));
@@ -329,7 +310,6 @@ void C2_MacroAssembler::fast_lock(Register objReg, Register boxReg, Register tmp
jccb(Assembler::notEqual, NO_COUNT); // If not recursive, ZF = 0 at this point (fail)
incq(Address(scrReg, OM_OFFSET_NO_MONITOR_VALUE_TAG(recursions)));
xorq(rax, rax); // Set ZF = 1 (success) for recursive lock, denoting locking success
-#endif // _LP64
bind(DONE_LABEL);
// ZFlag == 1 count in fast path
@@ -338,10 +318,8 @@ void C2_MacroAssembler::fast_lock(Register objReg, Register boxReg, Register tmp
bind(COUNT);
if (LockingMode == LM_LEGACY) {
-#ifdef _LP64
// Count monitors in fast path
increment(Address(thread, JavaThread::held_monitor_count_offset()));
-#endif
}
xorl(tmpReg, tmpReg); // Set ZF == 1
@@ -404,11 +382,6 @@ void C2_MacroAssembler::fast_unlock(Register objReg, Register boxReg, Register t
// It's inflated.
-#ifndef _LP64
- // Just take slow path to avoid dealing with 64 bit atomic instructions here.
- orl(boxReg, 1); // set ICC.ZF=0 to indicate failure
- jmpb(DONE_LABEL);
-#else
// Despite our balanced locking property we still check that m->_owner == Self
// as java routines or native JNI code called by this thread might
// have released the lock.
@@ -462,7 +435,6 @@ void C2_MacroAssembler::fast_unlock(Register objReg, Register boxReg, Register t
bind (LSuccess);
testl (boxReg, 0); // set ICC.ZF=1 to indicate success
jmpb (DONE_LABEL);
-#endif // _LP64
if (LockingMode == LM_LEGACY) {
bind (Stacked);
@@ -482,9 +454,7 @@ void C2_MacroAssembler::fast_unlock(Register objReg, Register boxReg, Register t
if (LockingMode == LM_LEGACY) {
// Count monitors in fast path
-#ifdef _LP64
decrementq(Address(r15_thread, JavaThread::held_monitor_count_offset()));
-#endif
}
xorl(tmpReg, tmpReg); // Set ZF == 1
@@ -506,7 +476,7 @@ void C2_MacroAssembler::fast_lock_lightweight(Register obj, Register box, Regist
Label slow_path;
if (UseObjectMonitorTable) {
- // Clear cache in case fast locking succeeds.
+ // Clear cache in case fast locking succeeds or we need to take the slow-path.
movptr(Address(box, BasicLock::object_monitor_cache_offset_in_bytes()), 0);
}
@@ -563,11 +533,6 @@ void C2_MacroAssembler::fast_lock_lightweight(Register obj, Register box, Regist
{ // Handle inflated monitor.
bind(inflated);
-#ifndef _LP64
- // Just take slow path to avoid dealing with 64 bit atomic instructions here.
- orl(box, 1); // set ICC.ZF=0 to indicate failure
- jmpb(slow_path);
-#else
const Register monitor = t;
if (!UseObjectMonitorTable) {
@@ -633,7 +598,6 @@ void C2_MacroAssembler::fast_lock_lightweight(Register obj, Register box, Regist
increment(recursions_address);
bind(monitor_locked);
-#endif // _LP64
}
bind(locked);
@@ -746,11 +710,6 @@ void C2_MacroAssembler::fast_unlock_lightweight(Register obj, Register reg_rax,
bind(inflated);
-#ifndef _LP64
- // Just take slow path to avoid dealing with 64 bit atomic instructions here.
- orl(t, 1); // set ICC.ZF=0 to indicate failure
- jmpb(slow_path);
-#else
if (!UseObjectMonitorTable) {
assert(mark == monitor, "should be the same here");
} else {
@@ -800,7 +759,6 @@ void C2_MacroAssembler::fast_unlock_lightweight(Register obj, Register reg_rax,
// Recursive unlock.
bind(recursive);
decrement(recursions_address);
-#endif // _LP64
}
bind(unlocked);
@@ -829,6 +787,119 @@ void C2_MacroAssembler::fast_unlock_lightweight(Register obj, Register reg_rax,
// C2 uses the value of ZF to determine the continuation.
}
+static void abort_verify_int_in_range(uint idx, jint val, jint lo, jint hi) {
+ fatal("Invalid CastII, idx: %u, val: %d, lo: %d, hi: %d", idx, val, lo, hi);
+}
+
+static void reconstruct_frame_pointer_helper(MacroAssembler* masm, Register dst) {
+ const int framesize = Compile::current()->output()->frame_size_in_bytes();
+ masm->movptr(dst, rsp);
+ if (framesize > 2 * wordSize) {
+ masm->addptr(dst, framesize - 2 * wordSize);
+ }
+}
+
+void C2_MacroAssembler::reconstruct_frame_pointer(Register rtmp) {
+ if (PreserveFramePointer) {
+ // frame pointer is valid
+#ifdef ASSERT
+ // Verify frame pointer value in rbp.
+ reconstruct_frame_pointer_helper(this, rtmp);
+ Label L_success;
+ cmpq(rbp, rtmp);
+ jccb(Assembler::equal, L_success);
+ STOP("frame pointer mismatch");
+ bind(L_success);
+#endif // ASSERT
+ } else {
+ reconstruct_frame_pointer_helper(this, rbp);
+ }
+}
+
+void C2_MacroAssembler::verify_int_in_range(uint idx, const TypeInt* t, Register val) {
+ jint lo = t->_lo;
+ jint hi = t->_hi;
+ assert(lo < hi, "type should not be empty or constant, idx: %u, lo: %d, hi: %d", idx, lo, hi);
+ if (t == TypeInt::INT) {
+ return;
+ }
+
+ BLOCK_COMMENT("CastII {");
+ Label fail;
+ Label succeed;
+ if (hi == max_jint) {
+ cmpl(val, lo);
+ jccb(Assembler::greaterEqual, succeed);
+ } else {
+ if (lo != min_jint) {
+ cmpl(val, lo);
+ jccb(Assembler::less, fail);
+ }
+ cmpl(val, hi);
+ jccb(Assembler::lessEqual, succeed);
+ }
+
+ bind(fail);
+ movl(c_rarg0, idx);
+ movl(c_rarg1, val);
+ movl(c_rarg2, lo);
+ movl(c_rarg3, hi);
+ reconstruct_frame_pointer(rscratch1);
+ call(RuntimeAddress(CAST_FROM_FN_PTR(address, abort_verify_int_in_range)));
+ hlt();
+ bind(succeed);
+ BLOCK_COMMENT("} // CastII");
+}
+
+static void abort_verify_long_in_range(uint idx, jlong val, jlong lo, jlong hi) {
+ fatal("Invalid CastLL, idx: %u, val: " JLONG_FORMAT ", lo: " JLONG_FORMAT ", hi: " JLONG_FORMAT, idx, val, lo, hi);
+}
+
+void C2_MacroAssembler::verify_long_in_range(uint idx, const TypeLong* t, Register val, Register tmp) {
+ jlong lo = t->_lo;
+ jlong hi = t->_hi;
+ assert(lo < hi, "type should not be empty or constant, idx: %u, lo: " JLONG_FORMAT ", hi: " JLONG_FORMAT, idx, lo, hi);
+ if (t == TypeLong::LONG) {
+ return;
+ }
+
+ BLOCK_COMMENT("CastLL {");
+ Label fail;
+ Label succeed;
+
+ auto cmp_val = [&](jlong bound) {
+ if (is_simm32(bound)) {
+      cmpq(val, checked_cast<int>(bound));
+ } else {
+ mov64(tmp, bound);
+ cmpq(val, tmp);
+ }
+ };
+
+ if (hi == max_jlong) {
+ cmp_val(lo);
+ jccb(Assembler::greaterEqual, succeed);
+ } else {
+ if (lo != min_jlong) {
+ cmp_val(lo);
+ jccb(Assembler::less, fail);
+ }
+ cmp_val(hi);
+ jccb(Assembler::lessEqual, succeed);
+ }
+
+ bind(fail);
+ movl(c_rarg0, idx);
+ movq(c_rarg1, val);
+ mov64(c_rarg2, lo);
+ mov64(c_rarg3, hi);
+ reconstruct_frame_pointer(rscratch1);
+ call(RuntimeAddress(CAST_FROM_FN_PTR(address, abort_verify_long_in_range)));
+ hlt();
+ bind(succeed);
+ BLOCK_COMMENT("} // CastLL");
+}
+
//-------------------------------------------------------------------------------------------
// Generic instructions support for use in .ad files C2 code generation
@@ -1174,7 +1245,6 @@ void C2_MacroAssembler::signum_fp(int opcode, XMMRegister dst, XMMRegister zero,
Label DONE_LABEL;
if (opcode == Op_SignumF) {
- assert(UseSSE > 0, "required");
ucomiss(dst, zero);
jcc(Assembler::equal, DONE_LABEL); // handle special case +0.0/-0.0, if argument is +0.0/-0.0, return argument
jcc(Assembler::parity, DONE_LABEL); // handle special case NaN, if argument NaN, return NaN
@@ -1182,7 +1252,6 @@ void C2_MacroAssembler::signum_fp(int opcode, XMMRegister dst, XMMRegister zero,
jcc(Assembler::above, DONE_LABEL);
xorps(dst, ExternalAddress(StubRoutines::x86::vector_float_sign_flip()), noreg);
} else if (opcode == Op_SignumD) {
- assert(UseSSE > 1, "required");
ucomisd(dst, zero);
jcc(Assembler::equal, DONE_LABEL); // handle special case +0.0/-0.0, if argument is +0.0/-0.0, return argument
jcc(Assembler::parity, DONE_LABEL); // handle special case NaN, if argument NaN, return NaN
@@ -1522,7 +1591,6 @@ void C2_MacroAssembler::vinsert(BasicType typ, XMMRegister dst, XMMRegister src,
}
}
-#ifdef _LP64
void C2_MacroAssembler::vgather8b_masked_offset(BasicType elem_bt,
XMMRegister dst, Register base,
Register idx_base,
@@ -1561,7 +1629,6 @@ void C2_MacroAssembler::vgather8b_masked_offset(BasicType elem_bt,
}
}
}
-#endif // _LP64
void C2_MacroAssembler::vgather8b_offset(BasicType elem_bt, XMMRegister dst,
Register base, Register idx_base,
@@ -1633,7 +1700,7 @@ void C2_MacroAssembler::vgather_subword(BasicType elem_ty, XMMRegister dst,
if (mask == noreg) {
vgather8b_offset(elem_ty, temp_dst, base, idx_base, offset, rtmp, vlen_enc);
} else {
- LP64_ONLY(vgather8b_masked_offset(elem_ty, temp_dst, base, idx_base, offset, mask, mask_idx, rtmp, vlen_enc));
+ vgather8b_masked_offset(elem_ty, temp_dst, base, idx_base, offset, mask, mask_idx, rtmp, vlen_enc);
}
// TEMP_PERM_VEC(temp_dst) = PERMUTE TMP_VEC_64(temp_dst) PERM_INDEX(xtmp1)
vpermd(temp_dst, xtmp1, temp_dst, vlen_enc == Assembler::AVX_512bit ? vlen_enc : Assembler::AVX_256bit);
@@ -2037,7 +2104,6 @@ void C2_MacroAssembler::reduceI(int opcode, int vlen,
}
}
-#ifdef _LP64
void C2_MacroAssembler::reduceL(int opcode, int vlen,
Register dst, Register src1, XMMRegister src2,
XMMRegister vtmp1, XMMRegister vtmp2) {
@@ -2049,7 +2115,6 @@ void C2_MacroAssembler::reduceL(int opcode, int vlen,
default: assert(false, "wrong vector length");
}
}
-#endif // _LP64
void C2_MacroAssembler::reduceF(int opcode, int vlen, XMMRegister dst, XMMRegister src, XMMRegister vtmp1, XMMRegister vtmp2) {
switch (vlen) {
@@ -2299,7 +2364,6 @@ void C2_MacroAssembler::reduce32S(int opcode, Register dst, Register src1, XMMRe
reduce16S(opcode, dst, src1, vtmp1, vtmp1, vtmp2);
}
-#ifdef _LP64
void C2_MacroAssembler::reduce2L(int opcode, Register dst, Register src1, XMMRegister src2, XMMRegister vtmp1, XMMRegister vtmp2) {
pshufd(vtmp2, src2, 0xE);
reduce_operation_128(T_LONG, opcode, vtmp2, src2);
@@ -2325,7 +2389,6 @@ void C2_MacroAssembler::genmask(KRegister dst, Register len, Register temp) {
bzhiq(temp, temp, len);
kmovql(dst, temp);
}
-#endif // _LP64
void C2_MacroAssembler::reduce2F(int opcode, XMMRegister dst, XMMRegister src, XMMRegister vtmp) {
reduce_operation_128(T_FLOAT, opcode, dst, src);
@@ -2741,7 +2804,6 @@ void C2_MacroAssembler::vpadd(BasicType elem_bt, XMMRegister dst, XMMRegister sr
}
}
-#ifdef _LP64
void C2_MacroAssembler::vpbroadcast(BasicType elem_bt, XMMRegister dst, Register src, int vlen_enc) {
assert(UseAVX >= 2, "required");
bool is_bw = ((elem_bt == T_BYTE) || (elem_bt == T_SHORT));
@@ -2770,7 +2832,6 @@ void C2_MacroAssembler::vpbroadcast(BasicType elem_bt, XMMRegister dst, Register
}
}
}
-#endif
void C2_MacroAssembler::vconvert_b2x(BasicType to_elem_bt, XMMRegister dst, XMMRegister src, int vlen_enc) {
switch (to_elem_bt) {
@@ -3698,7 +3759,7 @@ void C2_MacroAssembler::string_compare(Register str1, Register str2,
XMMRegister vec1, int ae, KRegister mask) {
ShortBranchVerifier sbv(this);
Label LENGTH_DIFF_LABEL, POP_LABEL, DONE_LABEL, WHILE_HEAD_LABEL;
- Label COMPARE_WIDE_VECTORS_LOOP_FAILED; // used only _LP64 && AVX3
+ Label COMPARE_WIDE_VECTORS_LOOP_FAILED; // used only AVX3
int stride, stride2, adr_stride, adr_stride1, adr_stride2;
int stride2x2 = 0x40;
Address::ScaleFactor scale = Address::no_scale;
@@ -3768,7 +3829,7 @@ void C2_MacroAssembler::string_compare(Register str1, Register str2,
Label COMPARE_WIDE_VECTORS_LOOP, COMPARE_16_CHARS, COMPARE_INDEX_CHAR;
Label COMPARE_WIDE_VECTORS_LOOP_AVX2;
Label COMPARE_TAIL_LONG;
- Label COMPARE_WIDE_VECTORS_LOOP_AVX3; // used only _LP64 && AVX3
+ Label COMPARE_WIDE_VECTORS_LOOP_AVX3; // used only AVX3
int pcmpmask = 0x19;
if (ae == StrIntrinsicNode::LL) {
@@ -3838,7 +3899,6 @@ void C2_MacroAssembler::string_compare(Register str1, Register str2,
// In a loop, compare 16-chars (32-bytes) at once using (vpxor+vptest)
bind(COMPARE_WIDE_VECTORS_LOOP);
-#ifdef _LP64
if ((AVX3Threshold == 0) && VM_Version::supports_avx512vlbw()) { // trying 64 bytes fast loop
cmpl(cnt2, stride2x2);
jccb(Assembler::below, COMPARE_WIDE_VECTORS_LOOP_AVX2);
@@ -3862,8 +3922,6 @@ void C2_MacroAssembler::string_compare(Register str1, Register str2,
vpxor(vec1, vec1);
jmpb(COMPARE_WIDE_TAIL);
}//if (VM_Version::supports_avx512vlbw())
-#endif // _LP64
-
bind(COMPARE_WIDE_VECTORS_LOOP_AVX2);
if (ae == StrIntrinsicNode::LL || ae == StrIntrinsicNode::UU) {
@@ -4032,7 +4090,6 @@ void C2_MacroAssembler::string_compare(Register str1, Register str2,
}
jmpb(DONE_LABEL);
-#ifdef _LP64
if (VM_Version::supports_avx512vlbw()) {
bind(COMPARE_WIDE_VECTORS_LOOP_FAILED);
@@ -4058,7 +4115,6 @@ void C2_MacroAssembler::string_compare(Register str1, Register str2,
subl(result, cnt1);
jmpb(POP_LABEL);
}//if (VM_Version::supports_avx512vlbw())
-#endif // _LP64
// Discard the stored length difference
bind(POP_LABEL);
@@ -4133,7 +4189,6 @@ void C2_MacroAssembler::count_positives(Register ary1, Register len,
// check the tail for absense of negatives
// ~(~0 << len) applied up to two times (for 32-bit scenario)
-#ifdef _LP64
{
Register tmp3_aliased = len;
mov64(tmp3_aliased, 0xFFFFFFFFFFFFFFFF);
@@ -4141,33 +4196,7 @@ void C2_MacroAssembler::count_positives(Register ary1, Register len,
notq(tmp3_aliased);
kmovql(mask2, tmp3_aliased);
}
-#else
- Label k_init;
- jmp(k_init);
-
- // We could not read 64-bits from a general purpose register thus we move
- // data required to compose 64 1's to the instruction stream
- // We emit 64 byte wide series of elements from 0..63 which later on would
- // be used as a compare targets with tail count contained in tmp1 register.
- // Result would be a k register having tmp1 consecutive number or 1
- // counting from least significant bit.
- address tmp = pc();
- emit_int64(0x0706050403020100);
- emit_int64(0x0F0E0D0C0B0A0908);
- emit_int64(0x1716151413121110);
- emit_int64(0x1F1E1D1C1B1A1918);
- emit_int64(0x2726252423222120);
- emit_int64(0x2F2E2D2C2B2A2928);
- emit_int64(0x3736353433323130);
- emit_int64(0x3F3E3D3C3B3A3938);
-
- bind(k_init);
- lea(len, InternalAddress(tmp));
- // create mask to test for negative byte inside a vector
- evpbroadcastb(vec1, tmp1, Assembler::AVX_512bit);
- evpcmpgtb(mask2, vec1, Address(len, 0), Assembler::AVX_512bit);
-#endif
evpcmpgtb(mask1, mask2, vec2, Address(ary1, 0), Assembler::AVX_512bit);
ktestq(mask1, mask2);
jcc(Assembler::zero, DONE);
@@ -4190,7 +4219,7 @@ void C2_MacroAssembler::count_positives(Register ary1, Register len,
// Fallthru to tail compare
} else {
- if (UseAVX >= 2 && UseSSE >= 2) {
+ if (UseAVX >= 2) {
// With AVX2, use 32-byte vector compare
Label COMPARE_WIDE_VECTORS, BREAK_LOOP;
@@ -4337,7 +4366,7 @@ void C2_MacroAssembler::count_positives(Register ary1, Register len,
// That's it
bind(DONE);
- if (UseAVX >= 2 && UseSSE >= 2) {
+ if (UseAVX >= 2) {
// clean upper bits of YMM registers
vpxor(vec1, vec1);
vpxor(vec2, vec2);
@@ -4414,7 +4443,6 @@ void C2_MacroAssembler::arrays_equals(bool is_array_equ, Register ary1, Register
lea(ary2, Address(ary2, limit, Address::times_1));
negptr(limit);
-#ifdef _LP64
if ((AVX3Threshold == 0) && VM_Version::supports_avx512vlbw()) { // trying 64 bytes fast loop
Label COMPARE_WIDE_VECTORS_LOOP_AVX2, COMPARE_WIDE_VECTORS_LOOP_AVX3;
@@ -4451,7 +4479,7 @@ void C2_MacroAssembler::arrays_equals(bool is_array_equ, Register ary1, Register
bind(COMPARE_WIDE_VECTORS_LOOP_AVX2);
}//if (VM_Version::supports_avx512vlbw())
-#endif //_LP64
+
bind(COMPARE_WIDE_VECTORS);
vmovdqu(vec1, Address(ary1, limit, scaleFactor));
if (expand_ary2) {
@@ -4618,8 +4646,6 @@ void C2_MacroAssembler::arrays_equals(bool is_array_equ, Register ary1, Register
}
}
-#ifdef _LP64
-
static void convertF2I_slowpath(C2_MacroAssembler& masm, C2GeneralStub<Register, XMMRegister, address>& stub) {
#define __ masm.
Register dst = stub.data<0>();
@@ -4666,8 +4692,6 @@ void C2_MacroAssembler::convertF2I(BasicType dst_bt, BasicType src_bt, Register
bind(stub->continuation());
}
-#endif // _LP64
-
void C2_MacroAssembler::evmasked_op(int ideal_opc, BasicType eType, KRegister mask, XMMRegister dst,
XMMRegister src1, int imm8, bool merge, int vlen_enc) {
switch(ideal_opc) {
@@ -5327,7 +5351,6 @@ void C2_MacroAssembler::vector_castD2X_evex(BasicType to_elem_bt, XMMRegister ds
}
}
-#ifdef _LP64
void C2_MacroAssembler::vector_round_double_evex(XMMRegister dst, XMMRegister src,
AddressLiteral double_sign_flip, AddressLiteral new_mxcsr, int vec_enc,
Register tmp, XMMRegister xtmp1, XMMRegister xtmp2, KRegister ktmp1, KRegister ktmp2) {
@@ -5379,7 +5402,6 @@ void C2_MacroAssembler::vector_round_float_avx(XMMRegister dst, XMMRegister src,
ldmxcsr(ExternalAddress(StubRoutines::x86::addr_mxcsr_std()), tmp /*rscratch*/);
}
-#endif // _LP64
void C2_MacroAssembler::vector_unsigned_cast(XMMRegister dst, XMMRegister src, int vlen_enc,
BasicType from_elem_bt, BasicType to_elem_bt) {
@@ -5510,7 +5532,6 @@ void C2_MacroAssembler::evpternlog(XMMRegister dst, int func, KRegister mask, XM
}
}
-#ifdef _LP64
void C2_MacroAssembler::vector_long_to_maskvec(XMMRegister dst, Register src, Register rtmp1,
Register rtmp2, XMMRegister xtmp, int mask_len,
int vec_enc) {
@@ -5710,7 +5731,7 @@ void C2_MacroAssembler::vector_compress_expand_avx2(int opcode, XMMRegister dst,
// in a permute table row contains either a valid permute index or a -1 (default)
// value, this can potentially be used as a blending mask after
// compressing/expanding the source vector lanes.
- vblendvps(dst, dst, xtmp, permv, vec_enc, false, permv);
+ vblendvps(dst, dst, xtmp, permv, vec_enc, true, permv);
}
void C2_MacroAssembler::vector_compress_expand(int opcode, XMMRegister dst, XMMRegister src, KRegister mask,
@@ -5768,7 +5789,6 @@ void C2_MacroAssembler::vector_compress_expand(int opcode, XMMRegister dst, XMMR
}
}
}
-#endif
void C2_MacroAssembler::vector_signum_evex(int opcode, XMMRegister dst, XMMRegister src, XMMRegister zero, XMMRegister one,
KRegister ktmp1, int vec_enc) {
@@ -5833,10 +5853,8 @@ void C2_MacroAssembler::vector_maskall_operation(KRegister dst, Register src, in
void C2_MacroAssembler::vbroadcast(BasicType bt, XMMRegister dst, int imm32, Register rtmp, int vec_enc) {
int lane_size = type2aelembytes(bt);
- bool is_LP64 = LP64_ONLY(true) NOT_LP64(false);
- if ((is_LP64 || lane_size < 8) &&
- ((is_non_subword_integral_type(bt) && VM_Version::supports_avx512vl()) ||
- (is_subword_type(bt) && VM_Version::supports_avx512vlbw()))) {
+ if ((is_non_subword_integral_type(bt) && VM_Version::supports_avx512vl()) ||
+ (is_subword_type(bt) && VM_Version::supports_avx512vlbw())) {
movptr(rtmp, imm32);
switch(lane_size) {
case 1 : evpbroadcastb(dst, rtmp, vec_enc); break;
@@ -5848,7 +5866,7 @@ void C2_MacroAssembler::vbroadcast(BasicType bt, XMMRegister dst, int imm32, Reg
}
} else {
movptr(rtmp, imm32);
- LP64_ONLY(movq(dst, rtmp)) NOT_LP64(movdl(dst, rtmp));
+ movq(dst, rtmp);
switch(lane_size) {
case 1 : vpbroadcastb(dst, dst, vec_enc); break;
case 2 : vpbroadcastw(dst, dst, vec_enc); break;
@@ -5983,14 +6001,6 @@ void C2_MacroAssembler::vector_popcount_integral_evex(BasicType bt, XMMRegister
}
}
-#ifndef _LP64
-void C2_MacroAssembler::vector_maskall_operation32(KRegister dst, Register src, KRegister tmp, int mask_len) {
- assert(VM_Version::supports_avx512bw(), "");
- kmovdl(tmp, src);
- kunpckdql(dst, tmp, tmp);
-}
-#endif
-
// Bit reversal algorithm first reverses the bits of each byte followed by
// a byte level reversal for multi-byte primitive types (short/int/long).
// Algorithm performs a lookup table access to get reverse bit sequence
@@ -6450,7 +6460,6 @@ void C2_MacroAssembler::udivmodI(Register rax, Register divisor, Register rdx, R
bind(done);
}
-#ifdef _LP64
void C2_MacroAssembler::reverseI(Register dst, Register src, XMMRegister xtmp1,
XMMRegister xtmp2, Register rtmp) {
if(VM_Version::supports_gfni()) {
@@ -6614,7 +6623,6 @@ void C2_MacroAssembler::udivmodL(Register rax, Register divisor, Register rdx, R
subq(rdx, tmp); // remainder
bind(done);
}
-#endif
void C2_MacroAssembler::rearrange_bytes(XMMRegister dst, XMMRegister shuffle, XMMRegister src, XMMRegister xtmp1,
XMMRegister xtmp2, XMMRegister xtmp3, Register rtmp, KRegister ktmp,
@@ -7090,9 +7098,34 @@ void C2_MacroAssembler::vector_saturating_op(int ideal_opc, BasicType elem_bt, X
}
}
+void C2_MacroAssembler::evfp16ph(int opcode, XMMRegister dst, XMMRegister src1, XMMRegister src2, int vlen_enc) {
+ switch(opcode) {
+ case Op_AddVHF: evaddph(dst, src1, src2, vlen_enc); break;
+ case Op_SubVHF: evsubph(dst, src1, src2, vlen_enc); break;
+ case Op_MulVHF: evmulph(dst, src1, src2, vlen_enc); break;
+ case Op_DivVHF: evdivph(dst, src1, src2, vlen_enc); break;
+ default: assert(false, "%s", NodeClassNames[opcode]); break;
+ }
+}
+
+void C2_MacroAssembler::evfp16ph(int opcode, XMMRegister dst, XMMRegister src1, Address src2, int vlen_enc) {
+ switch(opcode) {
+ case Op_AddVHF: evaddph(dst, src1, src2, vlen_enc); break;
+ case Op_SubVHF: evsubph(dst, src1, src2, vlen_enc); break;
+ case Op_MulVHF: evmulph(dst, src1, src2, vlen_enc); break;
+ case Op_DivVHF: evdivph(dst, src1, src2, vlen_enc); break;
+ default: assert(false, "%s", NodeClassNames[opcode]); break;
+ }
+}
+
void C2_MacroAssembler::scalar_max_min_fp16(int opcode, XMMRegister dst, XMMRegister src1, XMMRegister src2,
- KRegister ktmp, XMMRegister xtmp1, XMMRegister xtmp2, int vlen_enc) {
- if (opcode == Op_MaxHF) {
+ KRegister ktmp, XMMRegister xtmp1, XMMRegister xtmp2) {
+ vector_max_min_fp16(opcode, dst, src1, src2, ktmp, xtmp1, xtmp2, Assembler::AVX_128bit);
+}
+
+void C2_MacroAssembler::vector_max_min_fp16(int opcode, XMMRegister dst, XMMRegister src1, XMMRegister src2,
+ KRegister ktmp, XMMRegister xtmp1, XMMRegister xtmp2, int vlen_enc) {
+ if (opcode == Op_MaxVHF || opcode == Op_MaxHF) {
// Move sign bits of src2 to mask register.
evpmovw2m(ktmp, src2, vlen_enc);
// xtmp1 = src2 < 0 ? src2 : src1
@@ -7104,15 +7137,15 @@ void C2_MacroAssembler::scalar_max_min_fp16(int opcode, XMMRegister dst, XMMRegi
// the second source operand is returned. If only one value is a NaN (SNaN or QNaN) for this instruction,
// the second source operand, either a NaN or a valid floating-point value, is returned
// dst = max(xtmp1, xtmp2)
- vmaxsh(dst, xtmp1, xtmp2);
+ evmaxph(dst, xtmp1, xtmp2, vlen_enc);
// isNaN = is_unordered_quiet(xtmp1)
- evcmpsh(ktmp, k0, xtmp1, xtmp1, Assembler::UNORD_Q);
+ evcmpph(ktmp, k0, xtmp1, xtmp1, Assembler::UNORD_Q, vlen_enc);
// Final result is same as first source if its a NaN value,
// in case second operand holds a NaN value then as per above semantics
// result is same as second operand.
Assembler::evmovdquw(dst, ktmp, xtmp1, true, vlen_enc);
} else {
- assert(opcode == Op_MinHF, "");
+ assert(opcode == Op_MinVHF || opcode == Op_MinHF, "");
// Move sign bits of src1 to mask register.
evpmovw2m(ktmp, src1, vlen_enc);
// xtmp1 = src1 < 0 ? src2 : src1
@@ -7125,9 +7158,9 @@ void C2_MacroAssembler::scalar_max_min_fp16(int opcode, XMMRegister dst, XMMRegi
// If only one value is a NaN (SNaN or QNaN) for this instruction, the second source operand, either a NaN
// or a valid floating-point value, is written to the result.
// dst = min(xtmp1, xtmp2)
- vminsh(dst, xtmp1, xtmp2);
+ evminph(dst, xtmp1, xtmp2, vlen_enc);
// isNaN = is_unordered_quiet(xtmp1)
- evcmpsh(ktmp, k0, xtmp1, xtmp1, Assembler::UNORD_Q);
+ evcmpph(ktmp, k0, xtmp1, xtmp1, Assembler::UNORD_Q, vlen_enc);
// Final result is same as first source if its a NaN value,
// in case second operand holds a NaN value then as per above semantics
// result is same as second operand.
diff --git a/src/hotspot/cpu/x86/c2_MacroAssembler_x86.hpp b/src/hotspot/cpu/x86/c2_MacroAssembler_x86.hpp
index 29380609b9a1e..713eb73d68f38 100644
--- a/src/hotspot/cpu/x86/c2_MacroAssembler_x86.hpp
+++ b/src/hotspot/cpu/x86/c2_MacroAssembler_x86.hpp
@@ -44,6 +44,9 @@
Register t, Register thread);
void fast_unlock_lightweight(Register obj, Register reg_rax, Register t, Register thread);
+ void verify_int_in_range(uint idx, const TypeInt* t, Register val);
+ void verify_long_in_range(uint idx, const TypeLong* t, Register val, Register tmp);
+
// Generic instructions support for use in .ad files C2 code generation
void vabsnegd(int opcode, XMMRegister dst, XMMRegister src);
void vabsnegd(int opcode, XMMRegister dst, XMMRegister src, int vector_len);
@@ -130,9 +133,7 @@
// Covert B2X
void vconvert_b2x(BasicType to_elem_bt, XMMRegister dst, XMMRegister src, int vlen_enc);
-#ifdef _LP64
void vpbroadcast(BasicType elem_bt, XMMRegister dst, Register src, int vlen_enc);
-#endif
// blend
void evpcmp(BasicType typ, KRegister kdmask, KRegister ksmask, XMMRegister src1, XMMRegister src2, int comparison, int vector_len);
@@ -152,10 +153,8 @@
// dst = src1 reduce(op, src2) using vtmp as temps
void reduceI(int opcode, int vlen, Register dst, Register src1, XMMRegister src2, XMMRegister vtmp1, XMMRegister vtmp2);
-#ifdef _LP64
void reduceL(int opcode, int vlen, Register dst, Register src1, XMMRegister src2, XMMRegister vtmp1, XMMRegister vtmp2);
void genmask(KRegister dst, Register len, Register temp);
-#endif // _LP64
// dst = reduce(op, src2) using vtmp as temps
void reduce_fp(int opcode, int vlen,
@@ -202,11 +201,9 @@
void reduce32S(int opcode, Register dst, Register src1, XMMRegister src2, XMMRegister vtmp1, XMMRegister vtmp2);
// Long Reduction
-#ifdef _LP64
void reduce2L(int opcode, Register dst, Register src1, XMMRegister src2, XMMRegister vtmp1, XMMRegister vtmp2);
void reduce4L(int opcode, Register dst, Register src1, XMMRegister src2, XMMRegister vtmp1, XMMRegister vtmp2);
void reduce8L(int opcode, Register dst, Register src1, XMMRegister src2, XMMRegister vtmp1, XMMRegister vtmp2);
-#endif // _LP64
// Float Reduction
void reduce2F (int opcode, XMMRegister dst, XMMRegister src, XMMRegister vtmp);
@@ -237,7 +234,6 @@
void unordered_reduce_operation_256(BasicType typ, int opcode, XMMRegister dst, XMMRegister src1, XMMRegister src2);
public:
-#ifdef _LP64
void vector_mask_operation_helper(int opc, Register dst, Register tmp, int masklen);
void vector_mask_operation(int opc, Register dst, KRegister mask, Register tmp, int masklen, int masksize, int vec_enc);
@@ -246,14 +242,9 @@
Register tmp, int masklen, BasicType bt, int vec_enc);
void vector_long_to_maskvec(XMMRegister dst, Register src, Register rtmp1,
Register rtmp2, XMMRegister xtmp, int mask_len, int vec_enc);
-#endif
void vector_maskall_operation(KRegister dst, Register src, int mask_len);
-#ifndef _LP64
- void vector_maskall_operation32(KRegister dst, Register src, KRegister ktmp, int mask_len);
-#endif
-
void string_indexof_char(Register str1, Register cnt1, Register ch, Register result,
XMMRegister vec1, XMMRegister vec2, XMMRegister vec3, Register tmp);
@@ -313,9 +304,7 @@
void arrays_hashcode_elvload(XMMRegister dst, AddressLiteral src, BasicType eltype);
void arrays_hashcode_elvcast(XMMRegister dst, BasicType eltype);
-#ifdef _LP64
void convertF2I(BasicType dst_bt, BasicType src_bt, Register dst, XMMRegister src);
-#endif
void evmasked_op(int ideal_opc, BasicType eType, KRegister mask,
XMMRegister dst, XMMRegister src1, XMMRegister src2,
@@ -390,7 +379,6 @@
void vector_mask_cast(XMMRegister dst, XMMRegister src, BasicType dst_bt, BasicType src_bt, int vlen);
-#ifdef _LP64
void vector_round_double_evex(XMMRegister dst, XMMRegister src, AddressLiteral double_sign_flip, AddressLiteral new_mxcsr, int vec_enc,
Register tmp, XMMRegister xtmp1, XMMRegister xtmp2, KRegister ktmp1, KRegister ktmp2);
@@ -403,13 +391,11 @@
void vector_compress_expand_avx2(int opcode, XMMRegister dst, XMMRegister src, XMMRegister mask,
Register rtmp, Register rscratch, XMMRegister permv, XMMRegister xtmp,
BasicType bt, int vec_enc);
-#endif // _LP64
void udivI(Register rax, Register divisor, Register rdx);
void umodI(Register rax, Register divisor, Register rdx);
void udivmodI(Register rax, Register divisor, Register rdx, Register tmp);
-#ifdef _LP64
void reverseI(Register dst, Register src, XMMRegister xtmp1,
XMMRegister xtmp2, Register rtmp);
void reverseL(Register dst, Register src, XMMRegister xtmp1,
@@ -417,7 +403,6 @@
void udivL(Register rax, Register divisor, Register rdx);
void umodL(Register rax, Register divisor, Register rdx);
void udivmodL(Register rax, Register divisor, Register rdx, Register tmp);
-#endif
void evpternlog(XMMRegister dst, int func, KRegister mask, XMMRegister src2, XMMRegister src3,
bool merge, BasicType bt, int vlen_enc);
@@ -511,10 +496,9 @@
Register mask, XMMRegister xtmp1, XMMRegister xtmp2, XMMRegister xtmp3, Register rtmp,
Register midx, Register length, int vector_len, int vlen_enc);
-#ifdef _LP64
void vgather8b_masked_offset(BasicType elem_bt, XMMRegister dst, Register base, Register idx_base,
Register offset, Register mask, Register midx, Register rtmp, int vlen_enc);
-#endif
+
void vgather8b_offset(BasicType elem_bt, XMMRegister dst, Register base, Register idx_base,
Register offset, Register rtmp, int vlen_enc);
@@ -584,6 +568,16 @@
void select_from_two_vectors_evex(BasicType elem_bt, XMMRegister dst, XMMRegister src1, XMMRegister src2, int vlen_enc);
+ void evfp16ph(int opcode, XMMRegister dst, XMMRegister src1, XMMRegister src2, int vlen_enc);
+
+ void evfp16ph(int opcode, XMMRegister dst, XMMRegister src1, Address src2, int vlen_enc);
+
+ void vector_max_min_fp16(int opcode, XMMRegister dst, XMMRegister src1, XMMRegister src2,
+ KRegister ktmp, XMMRegister xtmp1, XMMRegister xtmp2, int vlen_enc);
+
void scalar_max_min_fp16(int opcode, XMMRegister dst, XMMRegister src1, XMMRegister src2,
- KRegister ktmp, XMMRegister xtmp1, XMMRegister xtmp2, int vlen_enc);
+ KRegister ktmp, XMMRegister xtmp1, XMMRegister xtmp2);
+
+ void reconstruct_frame_pointer(Register rtmp);
+
#endif // CPU_X86_C2_MACROASSEMBLER_X86_HPP
diff --git a/src/hotspot/cpu/x86/continuationFreezeThaw_x86.inline.hpp b/src/hotspot/cpu/x86/continuationFreezeThaw_x86.inline.hpp
index ba8fcb3aa9c51..9591c9f2c966b 100644
--- a/src/hotspot/cpu/x86/continuationFreezeThaw_x86.inline.hpp
+++ b/src/hotspot/cpu/x86/continuationFreezeThaw_x86.inline.hpp
@@ -261,15 +261,12 @@ template<typename FKind> frame ThawBase::new_stack_frame(const frame& hf, frame&
}
inline intptr_t* ThawBase::align(const frame& hf, intptr_t* frame_sp, frame& caller, bool bottom) {
-#ifdef _LP64
if (((intptr_t)frame_sp & 0xf) != 0) {
assert(caller.is_interpreted_frame() || (bottom && hf.compiled_frame_stack_argsize() % 2 != 0), "");
frame_sp--;
caller.set_sp(caller.sp() - 1);
}
assert(is_aligned(frame_sp, frame::frame_alignment), "");
-#endif
-
return frame_sp;
}
diff --git a/src/hotspot/cpu/x86/continuationHelper_x86.inline.hpp b/src/hotspot/cpu/x86/continuationHelper_x86.inline.hpp
index 46fe0946951e5..6d72e1b80e893 100644
--- a/src/hotspot/cpu/x86/continuationHelper_x86.inline.hpp
+++ b/src/hotspot/cpu/x86/continuationHelper_x86.inline.hpp
@@ -55,18 +55,11 @@ static inline void patch_return_pc_with_preempt_stub(frame& f) {
}
inline int ContinuationHelper::frame_align_words(int size) {
-#ifdef _LP64
return size & 1;
-#else
- return 0;
-#endif
}
inline intptr_t* ContinuationHelper::frame_align_pointer(intptr_t* sp) {
-#ifdef _LP64
- sp = align_down(sp, frame::frame_alignment);
-#endif
- return sp;
+ return align_down(sp, frame::frame_alignment);
}
template<typename FKind>
diff --git a/src/hotspot/cpu/x86/downcallLinker_x86_64.cpp b/src/hotspot/cpu/x86/downcallLinker_x86_64.cpp
index 15c311ffd39b7..c48940198ea89 100644
--- a/src/hotspot/cpu/x86/downcallLinker_x86_64.cpp
+++ b/src/hotspot/cpu/x86/downcallLinker_x86_64.cpp
@@ -291,7 +291,7 @@ void DowncallLinker::StubGenerator::generate() {
Assembler::StoreLoad | Assembler::StoreStore));
}
- __ safepoint_poll(L_safepoint_poll_slow_path, r15_thread, true /* at_return */, false /* in_nmethod */);
+ __ safepoint_poll(L_safepoint_poll_slow_path, true /* at_return */, false /* in_nmethod */);
__ cmpl(Address(r15_thread, JavaThread::suspend_flags_offset()), 0);
__ jcc(Assembler::notEqual, L_safepoint_poll_slow_path);
@@ -305,7 +305,7 @@ void DowncallLinker::StubGenerator::generate() {
__ jcc(Assembler::equal, L_reguard);
__ bind(L_after_reguard);
- __ reset_last_Java_frame(r15_thread, true);
+ __ reset_last_Java_frame(true);
__ block_comment("} thread native2java");
}
diff --git a/src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp b/src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp
index 4aa02c4d6278b..bc5d6a233d389 100644
--- a/src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp
+++ b/src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp
@@ -49,11 +49,7 @@ void G1BarrierSetAssembler::gen_write_ref_array_pre_barrier(MacroAssembler* masm
bool dest_uninitialized = (decorators & IS_DEST_UNINITIALIZED) != 0;
if (!dest_uninitialized) {
- Register thread = NOT_LP64(rax) LP64_ONLY(r15_thread);
-#ifndef _LP64
- __ push(thread);
- __ get_thread(thread);
-#endif
+ Register thread = r15_thread;
Label filtered;
Address in_progress(thread, in_bytes(G1ThreadLocalData::satb_mark_queue_active_offset()));
@@ -65,12 +61,9 @@ void G1BarrierSetAssembler::gen_write_ref_array_pre_barrier(MacroAssembler* masm
__ cmpb(in_progress, 0);
}
- NOT_LP64(__ pop(thread);)
-
__ jcc(Assembler::equal, filtered);
__ push_call_clobbered_registers(false /* save_fpu */);
-#ifdef _LP64
if (count == c_rarg0) {
if (addr == c_rarg1) {
// exactly backwards!!
@@ -88,10 +81,6 @@ void G1BarrierSetAssembler::gen_write_ref_array_pre_barrier(MacroAssembler* masm
} else {
__ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_array_pre_oop_entry), 2);
}
-#else
- __ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_array_pre_oop_entry),
- addr, count);
-#endif
__ pop_call_clobbered_registers(false /* save_fpu */);
__ bind(filtered);
@@ -101,7 +90,6 @@ void G1BarrierSetAssembler::gen_write_ref_array_pre_barrier(MacroAssembler* masm
void G1BarrierSetAssembler::gen_write_ref_array_post_barrier(MacroAssembler* masm, DecoratorSet decorators,
Register addr, Register count, Register tmp) {
__ push_call_clobbered_registers(false /* save_fpu */);
-#ifdef _LP64
if (c_rarg0 == count) { // On win64 c_rarg0 == rcx
assert_different_registers(c_rarg1, addr);
__ mov(c_rarg1, count);
@@ -112,53 +100,26 @@ void G1BarrierSetAssembler::gen_write_ref_array_post_barrier(MacroAssembler* mas
__ mov(c_rarg1, count);
}
__ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_array_post_entry), 2);
-#else
- __ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_array_post_entry),
- addr, count);
-#endif
__ pop_call_clobbered_registers(false /* save_fpu */);
+
}
void G1BarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type,
- Register dst, Address src, Register tmp1, Register tmp_thread) {
+ Register dst, Address src, Register tmp1) {
bool on_oop = is_reference_type(type);
bool on_weak = (decorators & ON_WEAK_OOP_REF) != 0;
bool on_phantom = (decorators & ON_PHANTOM_OOP_REF) != 0;
bool on_reference = on_weak || on_phantom;
- ModRefBarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1, tmp_thread);
+ ModRefBarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1);
if (on_oop && on_reference) {
- Register thread = NOT_LP64(tmp_thread) LP64_ONLY(r15_thread);
-
-#ifndef _LP64
- // Work around the x86_32 bug that only manifests with Loom for some reason.
- // MacroAssembler::resolve_weak_handle calls this barrier with tmp_thread == noreg.
- if (thread == noreg) {
- if (dst != rcx && tmp1 != rcx) {
- thread = rcx;
- } else if (dst != rdx && tmp1 != rdx) {
- thread = rdx;
- } else if (dst != rdi && tmp1 != rdi) {
- thread = rdi;
- }
- }
- assert_different_registers(dst, tmp1, thread);
- __ push(thread);
- __ get_thread(thread);
-#endif
-
// Generate the G1 pre-barrier code to log the value of
// the referent field in an SATB buffer.
g1_write_barrier_pre(masm /* masm */,
noreg /* obj */,
dst /* pre_val */,
- thread /* thread */,
tmp1 /* tmp */,
true /* tosca_live */,
true /* expand_call */);
-
-#ifndef _LP64
- __ pop(thread);
-#endif
}
}
@@ -199,7 +160,7 @@ static void generate_pre_barrier_slow_path(MacroAssembler* masm,
Label& runtime) {
// Do we need to load the previous value?
if (obj != noreg) {
- __ load_heap_oop(pre_val, Address(obj, 0), noreg, noreg, AS_RAW);
+ __ load_heap_oop(pre_val, Address(obj, 0), noreg, AS_RAW);
}
// Is the previous value null?
__ cmpptr(pre_val, NULL_WORD);
@@ -215,7 +176,6 @@ static void generate_pre_barrier_slow_path(MacroAssembler* masm,
void G1BarrierSetAssembler::g1_write_barrier_pre(MacroAssembler* masm,
Register obj,
Register pre_val,
- Register thread,
Register tmp,
bool tosca_live,
bool expand_call) {
@@ -223,9 +183,7 @@ void G1BarrierSetAssembler::g1_write_barrier_pre(MacroAssembler* masm,
// directly to skip generating the check by
// InterpreterMacroAssembler::call_VM_leaf_base that checks _last_sp.
-#ifdef _LP64
- assert(thread == r15_thread, "must be");
-#endif // _LP64
+ const Register thread = r15_thread;
Label done;
Label runtime;
@@ -260,18 +218,13 @@ void G1BarrierSetAssembler::g1_write_barrier_pre(MacroAssembler* masm,
// expand_call should be passed true.
if (expand_call) {
- LP64_ONLY( assert(pre_val != c_rarg1, "smashed arg"); )
-#ifdef _LP64
+ assert(pre_val != c_rarg1, "smashed arg");
if (c_rarg1 != thread) {
__ mov(c_rarg1, thread);
}
if (c_rarg0 != pre_val) {
__ mov(c_rarg0, pre_val);
}
-#else
- __ push(thread);
- __ push(pre_val);
-#endif
__ MacroAssembler::call_VM_leaf_base(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_field_pre_entry), 2);
} else {
__ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_field_pre_entry), pre_val, thread);
@@ -333,12 +286,9 @@ static void generate_post_barrier_slow_path(MacroAssembler* masm,
void G1BarrierSetAssembler::g1_write_barrier_post(MacroAssembler* masm,
Register store_addr,
Register new_val,
- Register thread,
Register tmp,
Register tmp2) {
-#ifdef _LP64
- assert(thread == r15_thread, "must be");
-#endif // _LP64
+ const Register thread = r15_thread;
Label done;
Label runtime;
@@ -350,7 +300,7 @@ void G1BarrierSetAssembler::g1_write_barrier_post(MacroAssembler* masm,
__ bind(runtime);
// save the live input values
- RegSet saved = RegSet::of(store_addr NOT_LP64(COMMA thread));
+ RegSet saved = RegSet::of(store_addr);
__ push_set(saved);
__ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_field_post_entry), tmp, thread);
__ pop_set(saved);
@@ -361,7 +311,6 @@ void G1BarrierSetAssembler::g1_write_barrier_post(MacroAssembler* masm,
#if defined(COMPILER2)
static void generate_c2_barrier_runtime_call(MacroAssembler* masm, G1BarrierStubC2* stub, const Register arg, const address runtime_path) {
-#ifdef _LP64
SaveLiveRegisters save_registers(masm, stub);
if (c_rarg0 != arg) {
__ mov(c_rarg0, arg);
@@ -373,20 +322,15 @@ static void generate_c2_barrier_runtime_call(MacroAssembler* masm, G1BarrierStub
// call. If it did not contain any live value, it is free to be used. In
// either case, it is safe to use it here as a call scratch register.
__ call(RuntimeAddress(runtime_path), rax);
-#else
- Unimplemented();
-#endif // _LP64
}
void G1BarrierSetAssembler::g1_write_barrier_pre_c2(MacroAssembler* masm,
Register obj,
Register pre_val,
- Register thread,
Register tmp,
G1PreBarrierStubC2* stub) {
-#ifdef _LP64
- assert(thread == r15_thread, "must be");
-#endif // _LP64
+ const Register thread = r15_thread;
+
assert(pre_val != noreg, "check this code");
if (obj != noreg) {
assert_different_registers(obj, pre_val, tmp);
@@ -422,14 +366,10 @@ void G1BarrierSetAssembler::generate_c2_pre_barrier_stub(MacroAssembler* masm,
void G1BarrierSetAssembler::g1_write_barrier_post_c2(MacroAssembler* masm,
Register store_addr,
Register new_val,
- Register thread,
Register tmp,
Register tmp2,
G1PostBarrierStubC2* stub) {
-#ifdef _LP64
- assert(thread == r15_thread, "must be");
-#endif // _LP64
-
+ const Register thread = r15_thread;
stub->initialize_registers(thread, tmp, tmp2);
bool new_val_may_be_null = (stub->barrier_data() & G1C2BarrierPostNotNull) == 0;
@@ -467,7 +407,6 @@ void G1BarrierSetAssembler::oop_store_at(MacroAssembler* masm, DecoratorSet deco
bool needs_pre_barrier = as_normal;
bool needs_post_barrier = val != noreg && in_heap;
- Register rthread = LP64_ONLY(r15_thread) NOT_LP64(rcx);
// flatten object address if needed
// We do it regardless of precise because we need the registers
if (dst.index() == noreg && dst.disp() == 0) {
@@ -478,18 +417,10 @@ void G1BarrierSetAssembler::oop_store_at(MacroAssembler* masm, DecoratorSet deco
__ lea(tmp1, dst);
}
-#ifndef _LP64
- InterpreterMacroAssembler *imasm = static_cast<InterpreterMacroAssembler*>(masm);
-#endif
-
- NOT_LP64(__ get_thread(rcx));
- NOT_LP64(imasm->save_bcp());
-
if (needs_pre_barrier) {
g1_write_barrier_pre(masm /*masm*/,
tmp1 /* obj */,
tmp2 /* pre_val */,
- rthread /* thread */,
tmp3 /* tmp */,
val != noreg /* tosca_live */,
false /* expand_call */);
@@ -510,12 +441,10 @@ void G1BarrierSetAssembler::oop_store_at(MacroAssembler* masm, DecoratorSet deco
g1_write_barrier_post(masm /*masm*/,
tmp1 /* store_adr */,
new_val /* new_val */,
- rthread /* thread */,
tmp3 /* tmp */,
tmp2 /* tmp2 */);
}
}
- NOT_LP64(imasm->restore_bcp());
}
#ifdef COMPILER1
@@ -575,11 +504,9 @@ void G1BarrierSetAssembler::generate_c1_pre_barrier_runtime_stub(StubAssembler*
__ push(rdx);
const Register pre_val = rax;
- const Register thread = NOT_LP64(rax) LP64_ONLY(r15_thread);
+ const Register thread = r15_thread;
const Register tmp = rdx;
- NOT_LP64(__ get_thread(thread);)
-
Address queue_active(thread, in_bytes(G1ThreadLocalData::satb_mark_queue_active_offset()));
Address queue_index(thread, in_bytes(G1ThreadLocalData::satb_mark_queue_index_offset()));
Address buffer(thread, in_bytes(G1ThreadLocalData::satb_mark_queue_buffer_offset()));
@@ -641,7 +568,7 @@ void G1BarrierSetAssembler::generate_c1_post_barrier_runtime_stub(StubAssembler*
// At this point we know new_value is non-null and the new_value crosses regions.
// Must check to see if card is already dirty
- const Register thread = NOT_LP64(rax) LP64_ONLY(r15_thread);
+ const Register thread = r15_thread;
Address queue_index(thread, in_bytes(G1ThreadLocalData::dirty_card_queue_index_offset()));
Address buffer(thread, in_bytes(G1ThreadLocalData::dirty_card_queue_buffer_offset()));
@@ -659,8 +586,6 @@ void G1BarrierSetAssembler::generate_c1_post_barrier_runtime_stub(StubAssembler*
__ movptr(cardtable, (intptr_t)ct->card_table()->byte_map_base());
__ addptr(card_addr, cardtable);
- NOT_LP64(__ get_thread(thread);)
-
__ cmpb(Address(card_addr, 0), G1CardTable::g1_young_card_val());
__ jcc(Assembler::equal, done);
diff --git a/src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.hpp b/src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.hpp
index 237786a84d243..774e87b916c65 100644
--- a/src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.hpp
+++ b/src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.hpp
@@ -44,7 +44,6 @@ class G1BarrierSetAssembler: public ModRefBarrierSetAssembler {
void g1_write_barrier_pre(MacroAssembler* masm,
Register obj,
Register pre_val,
- Register thread,
Register tmp,
bool tosca_live,
bool expand_call);
@@ -52,7 +51,6 @@ class G1BarrierSetAssembler: public ModRefBarrierSetAssembler {
void g1_write_barrier_post(MacroAssembler* masm,
Register store_addr,
Register new_val,
- Register thread,
Register tmp,
Register tmp2);
@@ -67,13 +65,12 @@ class G1BarrierSetAssembler: public ModRefBarrierSetAssembler {
void generate_c1_post_barrier_runtime_stub(StubAssembler* sasm);
virtual void load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type,
- Register dst, Address src, Register tmp1, Register tmp_thread);
+ Register dst, Address src, Register tmp1);
#ifdef COMPILER2
void g1_write_barrier_pre_c2(MacroAssembler* masm,
Register obj,
Register pre_val,
- Register thread,
Register tmp,
G1PreBarrierStubC2* c2_stub);
void generate_c2_pre_barrier_stub(MacroAssembler* masm,
@@ -81,7 +78,6 @@ class G1BarrierSetAssembler: public ModRefBarrierSetAssembler {
void g1_write_barrier_post_c2(MacroAssembler* masm,
Register store_addr,
Register new_val,
- Register thread,
Register tmp,
Register tmp2,
G1PostBarrierStubC2* c2_stub);
diff --git a/src/hotspot/cpu/x86/gc/g1/g1_x86_64.ad b/src/hotspot/cpu/x86/gc/g1/g1_x86_64.ad
index 8c1559f90f46d..819cd97696c15 100644
--- a/src/hotspot/cpu/x86/gc/g1/g1_x86_64.ad
+++ b/src/hotspot/cpu/x86/gc/g1/g1_x86_64.ad
@@ -52,7 +52,7 @@ static void write_barrier_pre(MacroAssembler* masm,
for (RegSetIterator reg = no_preserve.begin(); *reg != noreg; ++reg) {
stub->dont_preserve(*reg);
}
- g1_asm->g1_write_barrier_pre_c2(masm, obj, pre_val, r15_thread, tmp, stub);
+ g1_asm->g1_write_barrier_pre_c2(masm, obj, pre_val, tmp, stub);
}
static void write_barrier_post(MacroAssembler* masm,
@@ -67,7 +67,7 @@ static void write_barrier_post(MacroAssembler* masm,
Assembler::InlineSkippedInstructionsCounter skip_counter(masm);
G1BarrierSetAssembler* g1_asm = static_cast<G1BarrierSetAssembler*>(BarrierSet::barrier_set()->barrier_set_assembler());
G1PostBarrierStubC2* const stub = G1PostBarrierStubC2::create(node);
- g1_asm->g1_write_barrier_post_c2(masm, store_addr, new_val, r15_thread, tmp1, tmp2, stub);
+ g1_asm->g1_write_barrier_post_c2(masm, store_addr, new_val, tmp1, tmp2, stub);
}
%}
diff --git a/src/hotspot/cpu/x86/gc/shared/barrierSetAssembler_x86.cpp b/src/hotspot/cpu/x86/gc/shared/barrierSetAssembler_x86.cpp
index 5962609d08ede..925444792caac 100644
--- a/src/hotspot/cpu/x86/gc/shared/barrierSetAssembler_x86.cpp
+++ b/src/hotspot/cpu/x86/gc/shared/barrierSetAssembler_x86.cpp
@@ -40,7 +40,7 @@
#define __ masm->
void BarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type,
- Register dst, Address src, Register tmp1, Register tmp_thread) {
+ Register dst, Address src, Register tmp1) {
bool in_heap = (decorators & IN_HEAP) != 0;
bool in_native = (decorators & IN_NATIVE) != 0;
bool is_not_null = (decorators & IS_NOT_NULL) != 0;
@@ -50,7 +50,6 @@ void BarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators,
case T_OBJECT:
case T_ARRAY: {
if (in_heap) {
-#ifdef _LP64
if (UseCompressedOops) {
__ movl(dst, src);
if (is_not_null) {
@@ -58,9 +57,7 @@ void BarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators,
} else {
__ decode_heap_oop(dst);
}
- } else
-#endif
- {
+ } else {
__ movptr(dst, src);
}
} else {
@@ -77,28 +74,15 @@ void BarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators,
case T_ADDRESS: __ movptr(dst, src); break;
case T_FLOAT:
assert(dst == noreg, "only to ftos");
- __ load_float(src);
+ __ movflt(xmm0, src);
break;
case T_DOUBLE:
assert(dst == noreg, "only to dtos");
- __ load_double(src);
+ __ movdbl(xmm0, src);
break;
case T_LONG:
assert(dst == noreg, "only to ltos");
-#ifdef _LP64
__ movq(rax, src);
-#else
- if (atomic) {
- __ fild_d(src); // Must load atomically
- __ subptr(rsp,2*wordSize); // Make space for store
- __ fistp_d(Address(rsp,0));
- __ pop(rax);
- __ pop(rdx);
- } else {
- __ movl(rax, src);
- __ movl(rdx, src.plus_disp(wordSize));
- }
-#endif
break;
default: Unimplemented();
}
@@ -117,17 +101,12 @@ void BarrierSetAssembler::store_at(MacroAssembler* masm, DecoratorSet decorators
if (in_heap) {
if (val == noreg) {
assert(!is_not_null, "inconsistent access");
-#ifdef _LP64
if (UseCompressedOops) {
__ movl(dst, NULL_WORD);
} else {
__ movslq(dst, NULL_WORD);
}
-#else
- __ movl(dst, NULL_WORD);
-#endif
} else {
-#ifdef _LP64
if (UseCompressedOops) {
assert(!dst.uses(val), "not enough registers");
if (is_not_null) {
@@ -136,9 +115,7 @@ void BarrierSetAssembler::store_at(MacroAssembler* masm, DecoratorSet decorators
__ encode_heap_oop(val);
}
__ movl(dst, val);
- } else
-#endif
- {
+ } else {
__ movptr(dst, val);
}
}
@@ -167,28 +144,15 @@ void BarrierSetAssembler::store_at(MacroAssembler* masm, DecoratorSet decorators
break;
case T_LONG:
assert(val == noreg, "only tos");
-#ifdef _LP64
__ movq(dst, rax);
-#else
- if (atomic) {
- __ push(rdx);
- __ push(rax); // Must update atomically with FIST
- __ fild_d(Address(rsp,0)); // So load into FPU register
- __ fistp_d(dst); // and put into memory atomically
- __ addptr(rsp, 2*wordSize);
- } else {
- __ movptr(dst, rax);
- __ movptr(dst.plus_disp(wordSize), rdx);
- }
-#endif
break;
case T_FLOAT:
assert(val == noreg, "only tos");
- __ store_float(dst);
+ __ movflt(dst, xmm0);
break;
case T_DOUBLE:
assert(val == noreg, "only tos");
- __ store_double(dst);
+ __ movdbl(dst, xmm0);
break;
case T_ADDRESS:
__ movptr(dst, val);
@@ -216,20 +180,14 @@ void BarrierSetAssembler::copy_load_at(MacroAssembler* masm,
__ movl(dst, src);
break;
case 8:
-#ifdef _LP64
__ movq(dst, src);
-#else
- fatal("No support for 8 bytes copy");
-#endif
break;
default:
fatal("Unexpected size");
}
-#ifdef _LP64
if ((decorators & ARRAYCOPY_CHECKCAST) != 0 && UseCompressedOops) {
__ decode_heap_oop(dst);
}
-#endif
}
void BarrierSetAssembler::copy_store_at(MacroAssembler* masm,
@@ -239,11 +197,9 @@ void BarrierSetAssembler::copy_store_at(MacroAssembler* masm,
Address dst,
Register src,
Register tmp) {
-#ifdef _LP64
if ((decorators & ARRAYCOPY_CHECKCAST) != 0 && UseCompressedOops) {
__ encode_heap_oop(src);
}
-#endif
assert(bytes <= 8, "can only deal with non-vector registers");
switch (bytes) {
case 1:
@@ -256,11 +212,7 @@ void BarrierSetAssembler::copy_store_at(MacroAssembler* masm,
__ movl(dst, src);
break;
case 8:
-#ifdef _LP64
__ movq(dst, src);
-#else
- fatal("No support for 8 bytes copy");
-#endif
break;
default:
fatal("Unexpected size");
@@ -311,7 +263,7 @@ void BarrierSetAssembler::try_resolve_jobject_in_native(MacroAssembler* masm, Re
}
void BarrierSetAssembler::tlab_allocate(MacroAssembler* masm,
- Register thread, Register obj,
+ Register obj,
Register var_size_in_bytes,
int con_size_in_bytes,
Register t1,
@@ -320,15 +272,8 @@ void BarrierSetAssembler::tlab_allocate(MacroAssembler* masm,
assert_different_registers(obj, t1, t2);
assert_different_registers(obj, var_size_in_bytes, t1);
Register end = t2;
- if (!thread->is_valid()) {
-#ifdef _LP64
- thread = r15_thread;
-#else
- assert(t1->is_valid(), "need temp reg");
- thread = t1;
- __ get_thread(thread);
-#endif
- }
+
+ const Register thread = r15_thread;
__ verify_tlab();
@@ -351,7 +296,6 @@ void BarrierSetAssembler::tlab_allocate(MacroAssembler* masm,
__ verify_tlab();
}
-#ifdef _LP64
void BarrierSetAssembler::nmethod_entry_barrier(MacroAssembler* masm, Label* slow_path, Label* continuation) {
BarrierSetNMethod* bs_nm = BarrierSet::barrier_set()->barrier_set_nmethod();
Register thread = r15_thread;
@@ -375,35 +319,14 @@ void BarrierSetAssembler::nmethod_entry_barrier(MacroAssembler* masm, Label* slo
__ bind(done);
}
}
-#else
-void BarrierSetAssembler::nmethod_entry_barrier(MacroAssembler* masm, Label*, Label*) {
- BarrierSetNMethod* bs_nm = BarrierSet::barrier_set()->barrier_set_nmethod();
- Label continuation;
-
- Register tmp = rdi;
- __ push(tmp);
- __ movptr(tmp, (intptr_t)bs_nm->disarmed_guard_value_address());
- Address disarmed_addr(tmp, 0);
- __ align(4);
- __ cmpl_imm32(disarmed_addr, 0);
- __ pop(tmp);
- __ jcc(Assembler::equal, continuation);
- __ call(RuntimeAddress(StubRoutines::method_entry_barrier()));
- __ bind(continuation);
-}
-#endif
void BarrierSetAssembler::c2i_entry_barrier(MacroAssembler* masm) {
Label bad_call;
__ cmpptr(rbx, 0); // rbx contains the incoming method for c2i adapters.
__ jcc(Assembler::equal, bad_call);
- Register tmp1 = LP64_ONLY( rscratch1 ) NOT_LP64( rax );
- Register tmp2 = LP64_ONLY( rscratch2 ) NOT_LP64( rcx );
-#ifndef _LP64
- __ push(tmp1);
- __ push(tmp2);
-#endif // !_LP64
+ Register tmp1 = rscratch1;
+ Register tmp2 = rscratch2;
// Pointer chase to the method holder to find out if the method is concurrently unloading.
Label method_live;
@@ -419,19 +342,9 @@ void BarrierSetAssembler::c2i_entry_barrier(MacroAssembler* masm) {
__ cmpptr(tmp1, 0);
__ jcc(Assembler::notEqual, method_live);
-#ifndef _LP64
- __ pop(tmp2);
- __ pop(tmp1);
-#endif
-
__ bind(bad_call);
__ jump(RuntimeAddress(SharedRuntime::get_handle_wrong_method_stub()));
__ bind(method_live);
-
-#ifndef _LP64
- __ pop(tmp2);
- __ pop(tmp1);
-#endif
}
void BarrierSetAssembler::check_oop(MacroAssembler* masm, Register obj, Register tmp1, Register tmp2, Label& error) {
@@ -451,8 +364,6 @@ void BarrierSetAssembler::check_oop(MacroAssembler* masm, Register obj, Register
#ifdef COMPILER2
-#ifdef _LP64
-
OptoReg::Name BarrierSetAssembler::refine_register(const Node* node, OptoReg::Name opto_reg) {
if (!OptoReg::is_reg(opto_reg)) {
return OptoReg::Bad;
@@ -728,12 +639,4 @@ SaveLiveRegisters::~SaveLiveRegisters() {
}
}
-#else // !_LP64
-
-OptoReg::Name BarrierSetAssembler::refine_register(const Node* node, OptoReg::Name opto_reg) {
- Unimplemented(); // This must be implemented to support late barrier expansion.
-}
-
-#endif // _LP64
-
#endif // COMPILER2
diff --git a/src/hotspot/cpu/x86/gc/shared/barrierSetAssembler_x86.hpp b/src/hotspot/cpu/x86/gc/shared/barrierSetAssembler_x86.hpp
index 5dde1c7aeedbb..fd52379d2e2bc 100644
--- a/src/hotspot/cpu/x86/gc/shared/barrierSetAssembler_x86.hpp
+++ b/src/hotspot/cpu/x86/gc/shared/barrierSetAssembler_x86.hpp
@@ -44,7 +44,7 @@ class BarrierSetAssembler: public CHeapObj<mtGC> {
Register src, Register dst, Register count) {}
virtual void load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type,
- Register dst, Address src, Register tmp1, Register tmp_thread);
+ Register dst, Address src, Register tmp1);
virtual void store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type,
Address dst, Register val, Register tmp1, Register tmp2, Register tmp3);
@@ -93,7 +93,7 @@ class BarrierSetAssembler: public CHeapObj<mtGC> {
Register obj, Register tmp, Label& slowpath);
virtual void tlab_allocate(MacroAssembler* masm,
- Register thread, Register obj,
+ Register obj,
Register var_size_in_bytes,
int con_size_in_bytes,
Register t1, Register t2,
@@ -114,8 +114,6 @@ class BarrierSetAssembler: public CHeapObj<mtGC> {
#ifdef COMPILER2
-#ifdef _LP64
-
// This class saves and restores the registers that need to be preserved across
// the runtime call represented by a given C2 barrier stub. Use as follows:
// {
@@ -160,8 +158,6 @@ class SaveLiveRegisters {
~SaveLiveRegisters();
};
-#endif // _LP64
-
#endif // COMPILER2
#endif // CPU_X86_GC_SHARED_BARRIERSETASSEMBLER_X86_HPP
diff --git a/src/hotspot/cpu/x86/gc/shared/barrierSetNMethod_x86.cpp b/src/hotspot/cpu/x86/gc/shared/barrierSetNMethod_x86.cpp
index e99774cbc401a..c27af4a29cd1f 100644
--- a/src/hotspot/cpu/x86/gc/shared/barrierSetNMethod_x86.cpp
+++ b/src/hotspot/cpu/x86/gc/shared/barrierSetNMethod_x86.cpp
@@ -39,7 +39,6 @@
class NativeNMethodCmpBarrier: public NativeInstruction {
public:
-#ifdef _LP64
enum Intel_specific_constants {
instruction_code = 0x81,
instruction_size = 8,
@@ -47,14 +46,6 @@ class NativeNMethodCmpBarrier: public NativeInstruction {
instruction_rex_prefix = Assembler::REX | Assembler::REX_B,
instruction_modrm = 0x7f // [r15 + offset]
};
-#else
- enum Intel_specific_constants {
- instruction_code = 0x81,
- instruction_size = 7,
- imm_offset = 2,
- instruction_modrm = 0x3f // [rdi]
- };
-#endif
address instruction_address() const { return addr_at(0); }
address immediate_address() const { return addr_at(imm_offset); }
@@ -70,7 +61,6 @@ class NativeNMethodCmpBarrier: public NativeInstruction {
}
};
-#ifdef _LP64
bool NativeNMethodCmpBarrier::check_barrier(err_msg& msg) const {
// Only require 4 byte alignment
if (((uintptr_t) instruction_address()) & 0x3) {
@@ -97,29 +87,6 @@ bool NativeNMethodCmpBarrier::check_barrier(err_msg& msg) const {
}
return true;
}
-#else
-bool NativeNMethodCmpBarrier::check_barrier(err_msg& msg) const {
- if (((uintptr_t) instruction_address()) & 0x3) {
- msg.print("Addr: " INTPTR_FORMAT " not properly aligned", p2i(instruction_address()));
- return false;
- }
-
- int inst = ubyte_at(0);
- if (inst != instruction_code) {
- msg.print("Addr: " INTPTR_FORMAT " Code: 0x%x", p2i(instruction_address()),
- inst);
- return false;
- }
-
- int modrm = ubyte_at(1);
- if (modrm != instruction_modrm) {
- msg.print("Addr: " INTPTR_FORMAT " mod/rm: 0x%x", p2i(instruction_address()),
- modrm);
- return false;
- }
- return true;
-}
-#endif // _LP64
void BarrierSetNMethod::deoptimize(nmethod* nm, address* return_address_ptr) {
/*
@@ -169,15 +136,11 @@ void BarrierSetNMethod::deoptimize(nmethod* nm, address* return_address_ptr) {
// not find the expected native instruction at this offset, which needs updating.
// Note that this offset is invariant of PreserveFramePointer.
static int entry_barrier_offset(nmethod* nm) {
-#ifdef _LP64
if (nm->is_compiled_by_c2()) {
return -14;
} else {
return -15;
}
-#else
- return -18;
-#endif
}
static NativeNMethodCmpBarrier* native_nmethod_barrier(nmethod* nm) {
diff --git a/src/hotspot/cpu/x86/gc/shared/cardTableBarrierSetAssembler_x86.cpp b/src/hotspot/cpu/x86/gc/shared/cardTableBarrierSetAssembler_x86.cpp
index 81284323e395e..ba89b09e4dcdc 100644
--- a/src/hotspot/cpu/x86/gc/shared/cardTableBarrierSetAssembler_x86.cpp
+++ b/src/hotspot/cpu/x86/gc/shared/cardTableBarrierSetAssembler_x86.cpp
@@ -57,7 +57,6 @@ void CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier(MacroAssembl
__ jcc(Assembler::zero, L_done); // zero count - nothing to do
-#ifdef _LP64
__ leaq(end, Address(addr, count, TIMES_OOP, 0)); // end == addr+count*oop_size
__ subptr(end, BytesPerHeapOop); // end - 1 to make inclusive
__ shrptr(addr, CardTable::card_shift());
@@ -70,17 +69,6 @@ __ BIND(L_loop);
__ movb(Address(addr, count, Address::times_1), 0);
__ decrement(count);
__ jcc(Assembler::greaterEqual, L_loop);
-#else
- __ lea(end, Address(addr, count, Address::times_ptr, -wordSize));
- __ shrptr(addr, CardTable::card_shift());
- __ shrptr(end, CardTable::card_shift());
- __ subptr(end, addr); // end --> count
-__ BIND(L_loop);
- Address cardtable(addr, count, Address::times_1, disp);
- __ movb(cardtable, 0);
- __ decrement(count);
- __ jcc(Assembler::greaterEqual, L_loop);
-#endif
__ BIND(L_done);
}
diff --git a/src/hotspot/cpu/x86/gc/shared/modRefBarrierSetAssembler_x86.cpp b/src/hotspot/cpu/x86/gc/shared/modRefBarrierSetAssembler_x86.cpp
index 76066409a7caa..42109b069f2e0 100644
--- a/src/hotspot/cpu/x86/gc/shared/modRefBarrierSetAssembler_x86.cpp
+++ b/src/hotspot/cpu/x86/gc/shared/modRefBarrierSetAssembler_x86.cpp
@@ -31,10 +31,9 @@ void ModRefBarrierSetAssembler::arraycopy_prologue(MacroAssembler* masm, Decorat
Register src, Register dst, Register count) {
bool checkcast = (decorators & ARRAYCOPY_CHECKCAST) != 0;
bool disjoint = (decorators & ARRAYCOPY_DISJOINT) != 0;
- bool obj_int = type == T_OBJECT LP64_ONLY(&& UseCompressedOops);
+ bool obj_int = (type == T_OBJECT) && UseCompressedOops;
if (is_reference_type(type)) {
-#ifdef _LP64
if (!checkcast) {
if (!obj_int) {
// Save count for barrier
@@ -44,11 +43,6 @@ void ModRefBarrierSetAssembler::arraycopy_prologue(MacroAssembler* masm, Decorat
__ movq(r11, dst);
}
}
-#else
- if (disjoint) {
- __ mov(rdx, dst); // save 'to'
- }
-#endif
gen_write_ref_array_pre_barrier(masm, decorators, dst, count);
}
}
@@ -57,11 +51,10 @@ void ModRefBarrierSetAssembler::arraycopy_epilogue(MacroAssembler* masm, Decorat
Register src, Register dst, Register count) {
bool checkcast = (decorators & ARRAYCOPY_CHECKCAST) != 0;
bool disjoint = (decorators & ARRAYCOPY_DISJOINT) != 0;
- bool obj_int = type == T_OBJECT LP64_ONLY(&& UseCompressedOops);
+ bool obj_int = (type == T_OBJECT) && UseCompressedOops;
Register tmp = rax;
if (is_reference_type(type)) {
-#ifdef _LP64
if (!checkcast) {
if (!obj_int) {
// Save count for barrier
@@ -73,11 +66,6 @@ void ModRefBarrierSetAssembler::arraycopy_epilogue(MacroAssembler* masm, Decorat
} else {
tmp = rscratch1;
}
-#else
- if (disjoint) {
- __ mov(dst, rdx); // restore 'to'
- }
-#endif
gen_write_ref_array_post_barrier(masm, decorators, dst, count, tmp);
}
}
diff --git a/src/hotspot/cpu/x86/gc/shenandoah/c1/shenandoahBarrierSetC1_x86.cpp b/src/hotspot/cpu/x86/gc/shenandoah/c1/shenandoahBarrierSetC1_x86.cpp
index 063f4c2cc5ddf..66fb4cbb8c78d 100644
--- a/src/hotspot/cpu/x86/gc/shenandoah/c1/shenandoahBarrierSetC1_x86.cpp
+++ b/src/hotspot/cpu/x86/gc/shenandoah/c1/shenandoahBarrierSetC1_x86.cpp
@@ -26,14 +26,13 @@
#include "c1/c1_LIRAssembler.hpp"
#include "c1/c1_MacroAssembler.hpp"
#include "gc/shared/gc_globals.hpp"
+#include "gc/shenandoah/c1/shenandoahBarrierSetC1.hpp"
#include "gc/shenandoah/shenandoahBarrierSet.hpp"
#include "gc/shenandoah/shenandoahBarrierSetAssembler.hpp"
-#include "gc/shenandoah/c1/shenandoahBarrierSetC1.hpp"
#define __ masm->masm()->
void LIR_OpShenandoahCompareAndSwap::emit_code(LIR_Assembler* masm) {
- NOT_LP64(assert(_addr->is_single_cpu(), "must be single");)
Register addr = _addr->is_single_cpu() ? _addr->as_register() : _addr->as_register_lo();
Register newval = _new_value->as_register();
Register cmpval = _cmp_value->as_register();
@@ -46,14 +45,12 @@ void LIR_OpShenandoahCompareAndSwap::emit_code(LIR_Assembler* masm) {
assert(cmpval != addr, "cmp and addr must be in different registers");
assert(newval != addr, "new value and addr must be in different registers");
-#ifdef _LP64
if (UseCompressedOops) {
__ encode_heap_oop(cmpval);
__ mov(rscratch1, newval);
__ encode_heap_oop(rscratch1);
newval = rscratch1;
}
-#endif
ShenandoahBarrierSet::assembler()->cmpxchg_oop(masm->masm(), result, Address(addr, 0), cmpval, newval, false, tmp1, tmp2);
}
@@ -105,7 +102,7 @@ LIR_Opr ShenandoahBarrierSetC1::atomic_xchg_at_resolved(LIRAccess& access, LIRIt
// Because we want a 2-arg form of xchg and xadd
__ move(value_opr, result);
- assert(type == T_INT || is_reference_type(type) LP64_ONLY( || type == T_LONG ), "unexpected type");
+ assert(type == T_INT || is_reference_type(type) || type == T_LONG, "unexpected type");
__ xchg(access.resolved_addr(), result, result, LIR_OprFact::illegalOpr);
if (access.is_oop()) {
diff --git a/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp b/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp
index a33ec611f55cd..deb8111adade8 100644
--- a/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp
+++ b/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp
@@ -23,6 +23,8 @@
*
*/
+#include "gc/shenandoah/heuristics/shenandoahHeuristics.hpp"
+#include "gc/shenandoah/mode/shenandoahMode.hpp"
#include "gc/shenandoah/shenandoahBarrierSet.hpp"
#include "gc/shenandoah/shenandoahBarrierSetAssembler.hpp"
#include "gc/shenandoah/shenandoahForwarding.hpp"
@@ -30,8 +32,6 @@
#include "gc/shenandoah/shenandoahHeapRegion.hpp"
#include "gc/shenandoah/shenandoahRuntime.hpp"
#include "gc/shenandoah/shenandoahThreadLocalData.hpp"
-#include "gc/shenandoah/heuristics/shenandoahHeuristics.hpp"
-#include "gc/shenandoah/mode/shenandoahMode.hpp"
#include "interpreter/interpreter.hpp"
#include "runtime/javaThread.hpp"
#include "runtime/sharedRuntime.hpp"
@@ -51,63 +51,33 @@ static void save_machine_state(MacroAssembler* masm, bool handle_gpr, bool handl
if (handle_fp) {
// Some paths can be reached from the c2i adapter with live fp arguments in registers.
- LP64_ONLY(assert(Argument::n_float_register_parameters_j == 8, "8 fp registers to save at java call"));
-
- if (UseSSE >= 2) {
- const int xmm_size = wordSize * LP64_ONLY(2) NOT_LP64(4);
- __ subptr(rsp, xmm_size * 8);
- __ movdbl(Address(rsp, xmm_size * 0), xmm0);
- __ movdbl(Address(rsp, xmm_size * 1), xmm1);
- __ movdbl(Address(rsp, xmm_size * 2), xmm2);
- __ movdbl(Address(rsp, xmm_size * 3), xmm3);
- __ movdbl(Address(rsp, xmm_size * 4), xmm4);
- __ movdbl(Address(rsp, xmm_size * 5), xmm5);
- __ movdbl(Address(rsp, xmm_size * 6), xmm6);
- __ movdbl(Address(rsp, xmm_size * 7), xmm7);
- } else if (UseSSE >= 1) {
- const int xmm_size = wordSize * LP64_ONLY(1) NOT_LP64(2);
- __ subptr(rsp, xmm_size * 8);
- __ movflt(Address(rsp, xmm_size * 0), xmm0);
- __ movflt(Address(rsp, xmm_size * 1), xmm1);
- __ movflt(Address(rsp, xmm_size * 2), xmm2);
- __ movflt(Address(rsp, xmm_size * 3), xmm3);
- __ movflt(Address(rsp, xmm_size * 4), xmm4);
- __ movflt(Address(rsp, xmm_size * 5), xmm5);
- __ movflt(Address(rsp, xmm_size * 6), xmm6);
- __ movflt(Address(rsp, xmm_size * 7), xmm7);
- } else {
- __ push_FPU_state();
- }
+ assert(Argument::n_float_register_parameters_j == 8, "8 fp registers to save at java call");
+
+ const int xmm_size = wordSize * 2;
+ __ subptr(rsp, xmm_size * 8);
+ __ movdbl(Address(rsp, xmm_size * 0), xmm0);
+ __ movdbl(Address(rsp, xmm_size * 1), xmm1);
+ __ movdbl(Address(rsp, xmm_size * 2), xmm2);
+ __ movdbl(Address(rsp, xmm_size * 3), xmm3);
+ __ movdbl(Address(rsp, xmm_size * 4), xmm4);
+ __ movdbl(Address(rsp, xmm_size * 5), xmm5);
+ __ movdbl(Address(rsp, xmm_size * 6), xmm6);
+ __ movdbl(Address(rsp, xmm_size * 7), xmm7);
}
}
static void restore_machine_state(MacroAssembler* masm, bool handle_gpr, bool handle_fp) {
if (handle_fp) {
- if (UseSSE >= 2) {
- const int xmm_size = wordSize * LP64_ONLY(2) NOT_LP64(4);
- __ movdbl(xmm0, Address(rsp, xmm_size * 0));
- __ movdbl(xmm1, Address(rsp, xmm_size * 1));
- __ movdbl(xmm2, Address(rsp, xmm_size * 2));
- __ movdbl(xmm3, Address(rsp, xmm_size * 3));
- __ movdbl(xmm4, Address(rsp, xmm_size * 4));
- __ movdbl(xmm5, Address(rsp, xmm_size * 5));
- __ movdbl(xmm6, Address(rsp, xmm_size * 6));
- __ movdbl(xmm7, Address(rsp, xmm_size * 7));
- __ addptr(rsp, xmm_size * 8);
- } else if (UseSSE >= 1) {
- const int xmm_size = wordSize * LP64_ONLY(1) NOT_LP64(2);
- __ movflt(xmm0, Address(rsp, xmm_size * 0));
- __ movflt(xmm1, Address(rsp, xmm_size * 1));
- __ movflt(xmm2, Address(rsp, xmm_size * 2));
- __ movflt(xmm3, Address(rsp, xmm_size * 3));
- __ movflt(xmm4, Address(rsp, xmm_size * 4));
- __ movflt(xmm5, Address(rsp, xmm_size * 5));
- __ movflt(xmm6, Address(rsp, xmm_size * 6));
- __ movflt(xmm7, Address(rsp, xmm_size * 7));
- __ addptr(rsp, xmm_size * 8);
- } else {
- __ pop_FPU_state();
- }
+ const int xmm_size = wordSize * 2;
+ __ movdbl(xmm0, Address(rsp, xmm_size * 0));
+ __ movdbl(xmm1, Address(rsp, xmm_size * 1));
+ __ movdbl(xmm2, Address(rsp, xmm_size * 2));
+ __ movdbl(xmm3, Address(rsp, xmm_size * 3));
+ __ movdbl(xmm4, Address(rsp, xmm_size * 4));
+ __ movdbl(xmm5, Address(rsp, xmm_size * 5));
+ __ movdbl(xmm6, Address(rsp, xmm_size * 6));
+ __ movdbl(xmm7, Address(rsp, xmm_size * 7));
+ __ addptr(rsp, xmm_size * 8);
}
if (handle_gpr) {
@@ -124,11 +94,10 @@ void ShenandoahBarrierSetAssembler::arraycopy_prologue(MacroAssembler* masm, Dec
if (ShenandoahCardBarrier) {
bool checkcast = (decorators & ARRAYCOPY_CHECKCAST) != 0;
bool disjoint = (decorators & ARRAYCOPY_DISJOINT) != 0;
- bool obj_int = type == T_OBJECT LP64_ONLY(&& UseCompressedOops);
+ bool obj_int = (type == T_OBJECT) && UseCompressedOops;
// We need to save the original element count because the array copy stub
// will destroy the value and we need it for the card marking barrier.
-#ifdef _LP64
if (!checkcast) {
if (!obj_int) {
// Save count for barrier
@@ -138,30 +107,10 @@ void ShenandoahBarrierSetAssembler::arraycopy_prologue(MacroAssembler* masm, Dec
__ movq(r11, dst);
}
}
-#else
- if (disjoint) {
- __ mov(rdx, dst); // save 'to'
- }
-#endif
}
if ((ShenandoahSATBBarrier && !dest_uninitialized) || ShenandoahLoadRefBarrier) {
-#ifdef _LP64
Register thread = r15_thread;
-#else
- Register thread = rax;
- if (thread == src || thread == dst || thread == count) {
- thread = rbx;
- }
- if (thread == src || thread == dst || thread == count) {
- thread = rcx;
- }
- if (thread == src || thread == dst || thread == count) {
- thread = rdx;
- }
- __ push(thread);
- __ get_thread(thread);
-#endif
assert_different_registers(src, dst, count, thread);
Label L_done;
@@ -182,16 +131,13 @@ void ShenandoahBarrierSetAssembler::arraycopy_prologue(MacroAssembler* masm, Dec
save_machine_state(masm, /* handle_gpr = */ true, /* handle_fp = */ false);
-#ifdef _LP64
assert(src == rdi, "expected");
assert(dst == rsi, "expected");
assert(count == rdx, "expected");
if (UseCompressedOops) {
__ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::arraycopy_barrier_narrow_oop),
src, dst, count);
- } else
-#endif
- {
+ } else {
__ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::arraycopy_barrier_oop),
src, dst, count);
}
@@ -199,7 +145,6 @@ void ShenandoahBarrierSetAssembler::arraycopy_prologue(MacroAssembler* masm, Dec
restore_machine_state(masm, /* handle_gpr = */ true, /* handle_fp = */ false);
__ bind(L_done);
- NOT_LP64(__ pop(thread);)
}
}
@@ -211,10 +156,9 @@ void ShenandoahBarrierSetAssembler::arraycopy_epilogue(MacroAssembler* masm, Dec
if (ShenandoahCardBarrier && is_reference_type(type)) {
bool checkcast = (decorators & ARRAYCOPY_CHECKCAST) != 0;
bool disjoint = (decorators & ARRAYCOPY_DISJOINT) != 0;
- bool obj_int = type == T_OBJECT LP64_ONLY(&& UseCompressedOops);
+ bool obj_int = (type == T_OBJECT) && UseCompressedOops;
Register tmp = rax;
-#ifdef _LP64
if (!checkcast) {
if (!obj_int) {
// Save count for barrier
@@ -226,11 +170,6 @@ void ShenandoahBarrierSetAssembler::arraycopy_epilogue(MacroAssembler* masm, Dec
} else {
tmp = rscratch1;
}
-#else
- if (disjoint) {
- __ mov(dst, rdx); // restore 'to'
- }
-#endif
gen_write_ref_array_post_barrier(masm, decorators, dst, count, tmp);
}
}
@@ -238,20 +177,18 @@ void ShenandoahBarrierSetAssembler::arraycopy_epilogue(MacroAssembler* masm, Dec
void ShenandoahBarrierSetAssembler::shenandoah_write_barrier_pre(MacroAssembler* masm,
Register obj,
Register pre_val,
- Register thread,
Register tmp,
bool tosca_live,
bool expand_call) {
if (ShenandoahSATBBarrier) {
- satb_write_barrier_pre(masm, obj, pre_val, thread, tmp, tosca_live, expand_call);
+ satb_write_barrier_pre(masm, obj, pre_val, tmp, tosca_live, expand_call);
}
}
void ShenandoahBarrierSetAssembler::satb_write_barrier_pre(MacroAssembler* masm,
Register obj,
Register pre_val,
- Register thread,
Register tmp,
bool tosca_live,
bool expand_call) {
@@ -259,9 +196,7 @@ void ShenandoahBarrierSetAssembler::satb_write_barrier_pre(MacroAssembler* masm,
// directly to skip generating the check by
// InterpreterMacroAssembler::call_VM_leaf_base that checks _last_sp.
-#ifdef _LP64
- assert(thread == r15_thread, "must be");
-#endif // _LP64
+ const Register thread = r15_thread;
Label done;
Label runtime;
@@ -282,7 +217,7 @@ void ShenandoahBarrierSetAssembler::satb_write_barrier_pre(MacroAssembler* masm,
// Do we need to load the previous value?
if (obj != noreg) {
- __ load_heap_oop(pre_val, Address(obj, 0), noreg, noreg, AS_RAW);
+ __ load_heap_oop(pre_val, Address(obj, 0), noreg, AS_RAW);
}
// Is the previous value null?
@@ -327,9 +262,6 @@ void ShenandoahBarrierSetAssembler::satb_write_barrier_pre(MacroAssembler* masm,
// So when we do not have a full interpreter frame on the stack
// expand_call should be passed true.
- NOT_LP64( __ push(thread); )
-
-#ifdef _LP64
// We move pre_val into c_rarg0 early, in order to avoid smashing it, should
// pre_val be c_rarg1 (where the call prologue would copy thread argument).
// Note: this should not accidentally smash thread, because thread is always r15.
@@ -337,26 +269,18 @@ void ShenandoahBarrierSetAssembler::satb_write_barrier_pre(MacroAssembler* masm,
if (c_rarg0 != pre_val) {
__ mov(c_rarg0, pre_val);
}
-#endif
if (expand_call) {
- LP64_ONLY( assert(pre_val != c_rarg1, "smashed arg"); )
-#ifdef _LP64
+ assert(pre_val != c_rarg1, "smashed arg");
if (c_rarg1 != thread) {
__ mov(c_rarg1, thread);
}
// Already moved pre_val into c_rarg0 above
-#else
- __ push(thread);
- __ push(pre_val);
-#endif
__ MacroAssembler::call_VM_leaf_base(CAST_FROM_FN_PTR(address, ShenandoahRuntime::write_ref_field_pre), 2);
} else {
- __ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::write_ref_field_pre), LP64_ONLY(c_rarg0) NOT_LP64(pre_val), thread);
+ __ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::write_ref_field_pre), c_rarg0, thread);
}
- NOT_LP64( __ pop(thread); )
-
// save the live input values
if (pre_val != rax)
__ pop(pre_val);
@@ -383,16 +307,7 @@ void ShenandoahBarrierSetAssembler::load_reference_barrier(MacroAssembler* masm,
__ block_comment("load_reference_barrier { ");
// Check if GC is active
-#ifdef _LP64
Register thread = r15_thread;
-#else
- Register thread = rcx;
- if (thread == dst) {
- thread = rbx;
- }
- __ push(thread);
- __ get_thread(thread);
-#endif
Address gc_state(thread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
int flags = ShenandoahHeap::HAS_FORWARDED;
@@ -438,7 +353,7 @@ void ShenandoahBarrierSetAssembler::load_reference_barrier(MacroAssembler* masm,
// The rest is saved with the optimized path
- uint num_saved_regs = 4 + (dst != rax ? 1 : 0) LP64_ONLY(+4);
+ uint num_saved_regs = 4 + (dst != rax ? 1 : 0) + 4;
__ subptr(rsp, num_saved_regs * wordSize);
uint slot = num_saved_regs;
if (dst != rax) {
@@ -448,21 +363,15 @@ void ShenandoahBarrierSetAssembler::load_reference_barrier(MacroAssembler* masm,
__ movptr(Address(rsp, (--slot) * wordSize), rdx);
__ movptr(Address(rsp, (--slot) * wordSize), rdi);
__ movptr(Address(rsp, (--slot) * wordSize), rsi);
-#ifdef _LP64
__ movptr(Address(rsp, (--slot) * wordSize), r8);
__ movptr(Address(rsp, (--slot) * wordSize), r9);
__ movptr(Address(rsp, (--slot) * wordSize), r10);
__ movptr(Address(rsp, (--slot) * wordSize), r11);
// r12-r15 are callee saved in all calling conventions
-#endif
assert(slot == 0, "must use all slots");
// Shuffle registers such that dst is in c_rarg0 and addr in c_rarg1.
-#ifdef _LP64
Register arg0 = c_rarg0, arg1 = c_rarg1;
-#else
- Register arg0 = rdi, arg1 = rsi;
-#endif
if (dst == arg1) {
__ lea(arg0, src);
__ xchgptr(arg1, arg0);
@@ -489,12 +398,10 @@ void ShenandoahBarrierSetAssembler::load_reference_barrier(MacroAssembler* masm,
__ super_call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_phantom), arg0, arg1);
}
-#ifdef _LP64
__ movptr(r11, Address(rsp, (slot++) * wordSize));
__ movptr(r10, Address(rsp, (slot++) * wordSize));
__ movptr(r9, Address(rsp, (slot++) * wordSize));
__ movptr(r8, Address(rsp, (slot++) * wordSize));
-#endif
__ movptr(rsi, Address(rsp, (slot++) * wordSize));
__ movptr(rdi, Address(rsp, (slot++) * wordSize));
__ movptr(rdx, Address(rsp, (slot++) * wordSize));
@@ -520,10 +427,6 @@ void ShenandoahBarrierSetAssembler::load_reference_barrier(MacroAssembler* masm,
__ bind(heap_stable);
__ block_comment("} load_reference_barrier");
-
-#ifndef _LP64
- __ pop(thread);
-#endif
}
//
@@ -540,10 +443,10 @@ void ShenandoahBarrierSetAssembler::load_reference_barrier(MacroAssembler* masm,
// tmp1 (if it is valid)
//
void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type,
- Register dst, Address src, Register tmp1, Register tmp_thread) {
+ Register dst, Address src, Register tmp1) {
// 1: non-reference load, no additional barrier is needed
if (!is_reference_type(type)) {
- BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1, tmp_thread);
+ BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1);
return;
}
@@ -567,7 +470,7 @@ void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet d
assert_different_registers(dst, src.base(), src.index());
}
- BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1, tmp_thread);
+ BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1);
load_reference_barrier(masm, dst, src, decorators);
@@ -582,25 +485,19 @@ void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet d
dst = result_dst;
}
} else {
- BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1, tmp_thread);
+ BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1);
}
// 3: apply keep-alive barrier if needed
if (ShenandoahBarrierSet::need_keep_alive_barrier(decorators, type)) {
save_machine_state(masm, /* handle_gpr = */ true, /* handle_fp = */ true);
- Register thread = NOT_LP64(tmp_thread) LP64_ONLY(r15_thread);
- assert_different_registers(dst, tmp1, tmp_thread);
- if (!thread->is_valid()) {
- thread = rdx;
- }
- NOT_LP64(__ get_thread(thread));
+ assert_different_registers(dst, tmp1, r15_thread);
// Generate the SATB pre-barrier code to log the value of
// the referent field in an SATB buffer.
shenandoah_write_barrier_pre(masm /* masm */,
noreg /* obj */,
dst /* pre_val */,
- thread /* thread */,
tmp1 /* tmp */,
true /* tosca_live */,
true /* expand_call */);
@@ -618,23 +515,8 @@ void ShenandoahBarrierSetAssembler::store_check(MacroAssembler* masm, Register o
// We'll use this register as the TLS base address and also later on
// to hold the byte_map_base.
- Register thread = LP64_ONLY(r15_thread) NOT_LP64(rcx);
- Register tmp = LP64_ONLY(rscratch1) NOT_LP64(rdx);
-
-#ifndef _LP64
- // The next two ifs are just to get temporary registers to use for TLS and card table base.
- if (thread == obj) {
- thread = rdx;
- tmp = rsi;
- }
- if (tmp == obj) {
- tmp = rsi;
- }
-
- __ push(thread);
- __ push(tmp);
- __ get_thread(thread);
-#endif
+ Register thread = r15_thread;
+ Register tmp = rscratch1;
Address curr_ct_holder_addr(thread, in_bytes(ShenandoahThreadLocalData::card_table_offset()));
__ movptr(tmp, curr_ct_holder_addr);
@@ -650,11 +532,6 @@ void ShenandoahBarrierSetAssembler::store_check(MacroAssembler* masm, Register o
} else {
__ movb(card_addr, dirty);
}
-
-#ifndef _LP64
- __ pop(tmp);
- __ pop(thread);
-#endif
}
void ShenandoahBarrierSetAssembler::store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type,
@@ -666,7 +543,6 @@ void ShenandoahBarrierSetAssembler::store_at(MacroAssembler* masm, DecoratorSet
if (on_oop && in_heap) {
bool needs_pre_barrier = as_normal;
- Register rthread = LP64_ONLY(r15_thread) NOT_LP64(rcx);
// flatten object address if needed
// We do it regardless of precise because we need the registers
if (dst.index() == noreg && dst.disp() == 0) {
@@ -677,19 +553,12 @@ void ShenandoahBarrierSetAssembler::store_at(MacroAssembler* masm, DecoratorSet
__ lea(tmp1, dst);
}
- assert_different_registers(val, tmp1, tmp2, tmp3, rthread);
-
-#ifndef _LP64
- __ get_thread(rthread);
- InterpreterMacroAssembler *imasm = static_cast<InterpreterMacroAssembler*>(masm);
- imasm->save_bcp();
-#endif
+ assert_different_registers(val, tmp1, tmp2, tmp3, r15_thread);
if (needs_pre_barrier) {
shenandoah_write_barrier_pre(masm /*masm*/,
tmp1 /* obj */,
tmp2 /* pre_val */,
- rthread /* thread */,
tmp3 /* tmp */,
val != noreg /* tosca_live */,
false /* expand_call */);
@@ -701,7 +570,6 @@ void ShenandoahBarrierSetAssembler::store_at(MacroAssembler* masm, DecoratorSet
store_check(masm, tmp1);
}
}
- NOT_LP64(imasm->restore_bcp());
} else {
BarrierSetAssembler::store_at(masm, decorators, type, dst, val, tmp1, tmp2, tmp3);
}
@@ -736,12 +604,9 @@ void ShenandoahBarrierSetAssembler::cmpxchg_oop(MacroAssembler* masm,
Label L_success, L_failure;
// Remember oldval for retry logic below
-#ifdef _LP64
if (UseCompressedOops) {
__ movl(tmp1, oldval);
- } else
-#endif
- {
+ } else {
__ movptr(tmp1, oldval);
}
@@ -749,13 +614,10 @@ void ShenandoahBarrierSetAssembler::cmpxchg_oop(MacroAssembler* masm,
//
// Try to CAS with given arguments. If successful, then we are done.
-#ifdef _LP64
if (UseCompressedOops) {
__ lock();
__ cmpxchgl(newval, addr);
- } else
-#endif
- {
+ } else {
__ lock();
__ cmpxchgptr(newval, addr);
}
@@ -776,23 +638,15 @@ void ShenandoahBarrierSetAssembler::cmpxchg_oop(MacroAssembler* masm,
__ jcc(Assembler::zero, L_failure);
// Filter: when heap is stable, the failure is definitely legitimate
-#ifdef _LP64
const Register thread = r15_thread;
-#else
- const Register thread = tmp2;
- __ get_thread(thread);
-#endif
Address gc_state(thread, in_bytes(ShenandoahThreadLocalData::gc_state_offset()));
__ testb(gc_state, ShenandoahHeap::HAS_FORWARDED);
__ jcc(Assembler::zero, L_failure);
-#ifdef _LP64
if (UseCompressedOops) {
__ movl(tmp2, oldval);
__ decode_heap_oop(tmp2);
- } else
-#endif
- {
+ } else {
__ movptr(tmp2, oldval);
}
@@ -807,11 +661,9 @@ void ShenandoahBarrierSetAssembler::cmpxchg_oop(MacroAssembler* masm,
__ shrptr(tmp2, 2);
__ shlptr(tmp2, 2);
-#ifdef _LP64
if (UseCompressedOops) {
__ decode_heap_oop(tmp1); // decode for comparison
}
-#endif
// Now we have the forwarded offender in tmp2.
// Compare and if they don't match, we have legitimate failure
@@ -827,19 +679,14 @@ void ShenandoahBarrierSetAssembler::cmpxchg_oop(MacroAssembler* masm,
// with to-space ptr store. We still have to do the retry, because the GC might
// have updated the reference for us.
-#ifdef _LP64
if (UseCompressedOops) {
__ encode_heap_oop(tmp2); // previously decoded at step 2.
}
-#endif
-#ifdef _LP64
if (UseCompressedOops) {
__ lock();
__ cmpxchgl(tmp2, addr);
- } else
-#endif
- {
+ } else {
__ lock();
__ cmpxchgptr(tmp2, addr);
}
@@ -851,22 +698,16 @@ void ShenandoahBarrierSetAssembler::cmpxchg_oop(MacroAssembler* masm,
// from-space ptr into memory anymore. Make sure oldval is restored, after being
// garbled during retries.
//
-#ifdef _LP64
if (UseCompressedOops) {
__ movl(oldval, tmp2);
- } else
-#endif
- {
+ } else {
__ movptr(oldval, tmp2);
}
-#ifdef _LP64
if (UseCompressedOops) {
__ lock();
__ cmpxchgl(newval, addr);
- } else
-#endif
- {
+ } else {
__ lock();
__ cmpxchgptr(newval, addr);
}
@@ -918,7 +759,6 @@ void ShenandoahBarrierSetAssembler::gen_write_ref_array_post_barrier(MacroAssemb
__ testl(count, count);
__ jccb(Assembler::zero, L_done);
-#ifdef _LP64
const Register thread = r15_thread;
Address curr_ct_holder_addr(thread, in_bytes(ShenandoahThreadLocalData::card_table_offset()));
__ movptr(tmp, curr_ct_holder_addr);
@@ -935,26 +775,6 @@ void ShenandoahBarrierSetAssembler::gen_write_ref_array_post_barrier(MacroAssemb
__ movb(Address(addr, count, Address::times_1), 0);
__ decrement(count);
__ jccb(Assembler::greaterEqual, L_loop);
-#else
- const Register thread = tmp;
- __ get_thread(thread);
-
- Address curr_ct_holder_addr(thread, in_bytes(ShenandoahThreadLocalData::byte_map_base_offset()));
- __ movptr(tmp, curr_ct_holder_addr);
-
- __ lea(end, Address(addr, count, Address::times_ptr, -wordSize));
- __ shrptr(addr, CardTable::card_shift());
- __ shrptr(end, CardTable::card_shift());
- __ subptr(end, addr); // end --> count
-
- __ addptr(addr, tmp);
-
- __ BIND(L_loop);
- Address cardtable(addr, count, Address::times_1, 0);
- __ movb(cardtable, 0);
- __ decrement(count);
- __ jccb(Assembler::greaterEqual, L_loop);
-#endif
__ BIND(L_done);
}
@@ -1019,15 +839,8 @@ void ShenandoahBarrierSetAssembler::gen_load_reference_barrier_stub(LIR_Assemble
__ mov(tmp1, res);
__ shrptr(tmp1, ShenandoahHeapRegion::region_size_bytes_shift_jint());
__ movptr(tmp2, (intptr_t) ShenandoahHeap::in_cset_fast_test_addr());
-#ifdef _LP64
__ movbool(tmp2, Address(tmp2, tmp1, Address::times_1));
__ testbool(tmp2);
-#else
- // On x86_32, C1 register allocator can give us the register without 8-bit support.
- // Do the full-register access and test to avoid compilation failures.
- __ movptr(tmp2, Address(tmp2, tmp1, Address::times_1));
- __ testptr(tmp2, 0xFF);
-#endif
__ jcc(Assembler::zero, *stub->continuation());
}
@@ -1061,11 +874,9 @@ void ShenandoahBarrierSetAssembler::generate_c1_pre_barrier_runtime_stub(StubAss
__ push(rdx);
const Register pre_val = rax;
- const Register thread = NOT_LP64(rax) LP64_ONLY(r15_thread);
+ const Register thread = r15_thread;
const Register tmp = rdx;
- NOT_LP64(__ get_thread(thread);)
-
Address queue_index(thread, in_bytes(ShenandoahThreadLocalData::satb_mark_queue_index_offset()));
Address buffer(thread, in_bytes(ShenandoahThreadLocalData::satb_mark_queue_buffer_offset()));
@@ -1120,7 +931,6 @@ void ShenandoahBarrierSetAssembler::generate_c1_load_reference_barrier_runtime_s
bool is_phantom = ShenandoahBarrierSet::is_phantom_access(decorators);
bool is_native = ShenandoahBarrierSet::is_native_access(decorators);
-#ifdef _LP64
__ load_parameter(0, c_rarg0);
__ load_parameter(1, c_rarg1);
if (is_strong) {
@@ -1145,18 +955,6 @@ void ShenandoahBarrierSetAssembler::generate_c1_load_reference_barrier_runtime_s
assert(is_native, "phantom must only be called off-heap");
__ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_phantom), c_rarg0, c_rarg1);
}
-#else
- __ load_parameter(0, rax);
- __ load_parameter(1, rbx);
- if (is_strong) {
- __ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_strong), rax, rbx);
- } else if (is_weak) {
- __ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_weak), rax, rbx);
- } else {
- assert(is_phantom, "only remaining strength");
- __ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_phantom), rax, rbx);
- }
-#endif
__ restore_live_registers_except_rax(true);
diff --git a/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.hpp b/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.hpp
index ae0ad3533e146..b0185f2dbffbd 100644
--- a/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.hpp
+++ b/src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.hpp
@@ -44,7 +44,6 @@ class ShenandoahBarrierSetAssembler: public BarrierSetAssembler {
void satb_write_barrier_pre(MacroAssembler* masm,
Register obj,
Register pre_val,
- Register thread,
Register tmp,
bool tosca_live,
bool expand_call);
@@ -52,7 +51,6 @@ class ShenandoahBarrierSetAssembler: public BarrierSetAssembler {
void shenandoah_write_barrier_pre(MacroAssembler* masm,
Register obj,
Register pre_val,
- Register thread,
Register tmp,
bool tosca_live,
bool expand_call);
@@ -81,7 +79,7 @@ class ShenandoahBarrierSetAssembler: public BarrierSetAssembler {
virtual void arraycopy_epilogue(MacroAssembler* masm, DecoratorSet decorators, BasicType type,
Register src, Register dst, Register count);
virtual void load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type,
- Register dst, Address src, Register tmp1, Register tmp_thread);
+ Register dst, Address src, Register tmp1);
virtual void store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type,
Address dst, Register val, Register tmp1, Register tmp2, Register tmp3);
virtual void try_resolve_jobject_in_native(MacroAssembler* masm, Register jni_env,
diff --git a/src/hotspot/cpu/x86/gc/z/zAddress_x86.cpp b/src/hotspot/cpu/x86/gc/z/zAddress_x86.cpp
index 3667a52050c7a..6b5b64d30367f 100644
--- a/src/hotspot/cpu/x86/gc/z/zAddress_x86.cpp
+++ b/src/hotspot/cpu/x86/gc/z/zAddress_x86.cpp
@@ -32,7 +32,7 @@ size_t ZPointerLoadShift;
size_t ZPlatformAddressOffsetBits() {
const size_t min_address_offset_bits = 42; // 4TB
const size_t max_address_offset_bits = 44; // 16TB
- const size_t address_offset = round_up_power_of_2(MaxHeapSize * ZVirtualToPhysicalRatio);
+ const size_t address_offset = ZGlobalsPointers::min_address_offset_request();
const size_t address_offset_bits = log2i_exact(address_offset);
return clamp(address_offset_bits, min_address_offset_bits, max_address_offset_bits);
}
diff --git a/src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.cpp b/src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.cpp
index f7b1e25cf3b5d..4a956b450bdc9 100644
--- a/src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.cpp
+++ b/src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.cpp
@@ -220,11 +220,10 @@ void ZBarrierSetAssembler::load_at(MacroAssembler* masm,
BasicType type,
Register dst,
Address src,
- Register tmp1,
- Register tmp_thread) {
+ Register tmp1) {
if (!ZBarrierSet::barrier_needed(decorators, type)) {
// Barrier not needed
- BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1, tmp_thread);
+ BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1);
return;
}
@@ -1329,7 +1328,13 @@ void ZBarrierSetAssembler::patch_barrier_relocation(address addr, int format) {
const uint16_t value = patch_barrier_relocation_value(format);
uint8_t* const patch_addr = (uint8_t*)addr + offset;
if (format == ZBarrierRelocationFormatLoadGoodBeforeShl) {
- *patch_addr = (uint8_t)value;
+ if (VM_Version::supports_apx_f()) {
+ NativeInstruction* instruction = nativeInstruction_at(addr);
+ uint8_t* const rex2_patch_addr = patch_addr + (instruction->has_rex2_prefix() ? 1 : 0);
+ *rex2_patch_addr = (uint8_t)value;
+ } else {
+ *patch_addr = (uint8_t)value;
+ }
} else {
*(uint16_t*)patch_addr = value;
}
diff --git a/src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.hpp b/src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.hpp
index 91be2e3b94585..8bb653ec5fbaf 100644
--- a/src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.hpp
+++ b/src/hotspot/cpu/x86/gc/z/zBarrierSetAssembler_x86.hpp
@@ -73,8 +73,7 @@ class ZBarrierSetAssembler : public ZBarrierSetAssemblerBase {
BasicType type,
Register dst,
Address src,
- Register tmp1,
- Register tmp_thread);
+ Register tmp1);
virtual void store_at(MacroAssembler* masm,
DecoratorSet decorators,
diff --git a/src/hotspot/cpu/x86/globalDefinitions_x86.hpp b/src/hotspot/cpu/x86/globalDefinitions_x86.hpp
index 873cfbdcea0ec..3c1474ae8611a 100644
--- a/src/hotspot/cpu/x86/globalDefinitions_x86.hpp
+++ b/src/hotspot/cpu/x86/globalDefinitions_x86.hpp
@@ -34,9 +34,7 @@ const bool CCallingConventionRequiresIntsAsLongs = false;
#define SUPPORTS_NATIVE_CX8
-#ifdef _LP64
#define SUPPORT_MONITOR_COUNT
-#endif
#define CPU_MULTI_COPY_ATOMIC
@@ -44,15 +42,11 @@ const bool CCallingConventionRequiresIntsAsLongs = false;
#define DEFAULT_CACHE_LINE_SIZE 64
// The default padding size for data structures to avoid false sharing.
-#ifdef _LP64
// The common wisdom is that adjacent cache line prefetchers on some hardware
// may pull two cache lines on access, so we have to pessimistically assume twice
// the cache line size for padding. TODO: Check if this is still true for modern
// hardware. If not, DEFAULT_CACHE_LINE_SIZE might as well suffice.
#define DEFAULT_PADDING_SIZE (DEFAULT_CACHE_LINE_SIZE*2)
-#else
-#define DEFAULT_PADDING_SIZE DEFAULT_CACHE_LINE_SIZE
-#endif
#if defined(LINUX) || defined(__APPLE__)
#define SUPPORT_RESERVED_STACK_AREA
diff --git a/src/hotspot/cpu/x86/globals_x86.hpp b/src/hotspot/cpu/x86/globals_x86.hpp
index 54888a9f849d9..a1d4a71874f55 100644
--- a/src/hotspot/cpu/x86/globals_x86.hpp
+++ b/src/hotspot/cpu/x86/globals_x86.hpp
@@ -61,29 +61,19 @@ define_pd_global(intx, InlineSmallCode, 1000);
#define MIN_STACK_RED_PAGES DEFAULT_STACK_RED_PAGES
#define MIN_STACK_RESERVED_PAGES (0)
-#ifdef _LP64
// Java_java_net_SocketOutputStream_socketWrite0() uses a 64k buffer on the
-// stack if compiled for unix and LP64. To pass stack overflow tests we need
-// 20 shadow pages.
+// stack if compiled for unix. To pass stack overflow tests we need 20 shadow pages.
#define DEFAULT_STACK_SHADOW_PAGES (NOT_WIN64(20) WIN64_ONLY(8) DEBUG_ONLY(+4))
// For those clients that do not use write socket, we allow
// the min range value to be below that of the default
#define MIN_STACK_SHADOW_PAGES (NOT_WIN64(10) WIN64_ONLY(8) DEBUG_ONLY(+4))
-#else
-#define DEFAULT_STACK_SHADOW_PAGES (4 DEBUG_ONLY(+5))
-#define MIN_STACK_SHADOW_PAGES DEFAULT_STACK_SHADOW_PAGES
-#endif // _LP64
define_pd_global(intx, StackYellowPages, DEFAULT_STACK_YELLOW_PAGES);
define_pd_global(intx, StackRedPages, DEFAULT_STACK_RED_PAGES);
define_pd_global(intx, StackShadowPages, DEFAULT_STACK_SHADOW_PAGES);
define_pd_global(intx, StackReservedPages, DEFAULT_STACK_RESERVED_PAGES);
-#ifdef _LP64
define_pd_global(bool, VMContinuations, true);
-#else
-define_pd_global(bool, VMContinuations, false);
-#endif
define_pd_global(bool, RewriteBytecodes, true);
define_pd_global(bool, RewriteFrequentPairs, true);
@@ -191,6 +181,15 @@ define_pd_global(intx, InitArrayShortSize, 8*BytesPerLong);
product(bool, IntelJccErratumMitigation, true, DIAGNOSTIC, \
"Turn off JVM mitigations related to Intel micro code " \
"mitigations for the Intel JCC erratum") \
+ \
+ product(int, X86ICacheSync, -1, DIAGNOSTIC, \
+ "Select the X86 ICache sync mechanism: -1 = auto-select; " \
+ "0 = none (dangerous); 1 = CLFLUSH loop; 2 = CLFLUSHOPT loop; "\
+ "3 = CLWB loop; 4 = single CPUID; 5 = single SERIALIZE. " \
+ "Explicitly selected mechanism will fail at startup if " \
+ "hardware does not support it.") \
+ range(-1, 5) \
+ \
// end of ARCH_FLAGS
#endif // CPU_X86_GLOBALS_X86_HPP
diff --git a/src/hotspot/cpu/x86/icache_x86.cpp b/src/hotspot/cpu/x86/icache_x86.cpp
index 45679332ecaca..889cfb32931e6 100644
--- a/src/hotspot/cpu/x86/icache_x86.cpp
+++ b/src/hotspot/cpu/x86/icache_x86.cpp
@@ -23,15 +23,63 @@
*/
#include "asm/macroAssembler.hpp"
+#include "runtime/flags/flagSetting.hpp"
+#include "runtime/globals_extension.hpp"
#include "runtime/icache.hpp"
#define __ _masm->
+void x86_generate_icache_fence(MacroAssembler* _masm) {
+ switch (X86ICacheSync) {
+ case 0:
+ break;
+ case 1:
+ __ mfence();
+ break;
+ case 2:
+ case 3:
+ __ sfence();
+ break;
+ case 4:
+ __ push(rax);
+ __ push(rbx);
+ __ push(rcx);
+ __ push(rdx);
+ __ xorptr(rax, rax);
+ __ cpuid();
+ __ pop(rdx);
+ __ pop(rcx);
+ __ pop(rbx);
+ __ pop(rax);
+ break;
+ case 5:
+ __ serialize();
+ break;
+ default:
+ ShouldNotReachHere();
+ }
+}
+
+void x86_generate_icache_flush_insn(MacroAssembler* _masm, Register addr) {
+ switch (X86ICacheSync) {
+ case 1:
+ __ clflush(Address(addr, 0));
+ break;
+ case 2:
+ __ clflushopt(Address(addr, 0));
+ break;
+ case 3:
+ __ clwb(Address(addr, 0));
+ break;
+ default:
+ ShouldNotReachHere();
+ }
+}
+
void ICacheStubGenerator::generate_icache_flush(ICache::flush_icache_stub_t* flush_icache_stub) {
- StubCodeMark mark(this, "ICache", "flush_icache_stub");
+ StubCodeMark mark(this, "ICache", _stub_name);
address start = __ pc();
-#ifdef AMD64
const Register addr = c_rarg0;
const Register lines = c_rarg1;
@@ -40,26 +88,22 @@ void ICacheStubGenerator::generate_icache_flush(ICache::flush_icache_stub_t* flu
Label flush_line, done;
__ testl(lines, lines);
- __ jcc(Assembler::zero, done);
+ __ jccb(Assembler::zero, done);
- // Force ordering wrt cflush.
- // Other fence and sync instructions won't do the job.
- __ mfence();
+ x86_generate_icache_fence(_masm);
- __ bind(flush_line);
- __ clflush(Address(addr, 0));
- __ addptr(addr, ICache::line_size);
- __ decrementl(lines);
- __ jcc(Assembler::notZero, flush_line);
+ if (1 <= X86ICacheSync && X86ICacheSync <= 3) {
+ __ bind(flush_line);
+ x86_generate_icache_flush_insn(_masm, addr);
+ __ addptr(addr, ICache::line_size);
+ __ decrementl(lines);
+ __ jccb(Assembler::notZero, flush_line);
- __ mfence();
+ x86_generate_icache_fence(_masm);
+ }
__ bind(done);
-#else
- const Address magic(rsp, 3*wordSize);
- __ lock(); __ addl(Address(rsp, 0), 0);
-#endif // AMD64
__ movptr(rax, magic); // Handshake with caller to make sure it happened!
__ ret(0);
@@ -67,4 +111,22 @@ void ICacheStubGenerator::generate_icache_flush(ICache::flush_icache_stub_t* flu
*flush_icache_stub = (ICache::flush_icache_stub_t)start;
}
+void ICache::initialize(int phase) {
+ switch (phase) {
+ case 1: {
+ // Initial phase, we assume only CLFLUSH is available.
+ IntFlagSetting fs(X86ICacheSync, 1);
+ AbstractICache::initialize(phase);
+ break;
+ }
+ case 2: {
+ // Final phase, generate the stub again.
+ AbstractICache::initialize(phase);
+ break;
+ }
+ default:
+ ShouldNotReachHere();
+ }
+}
+
#undef __
diff --git a/src/hotspot/cpu/x86/icache_x86.hpp b/src/hotspot/cpu/x86/icache_x86.hpp
index 48286a7e3b385..805022fbb3225 100644
--- a/src/hotspot/cpu/x86/icache_x86.hpp
+++ b/src/hotspot/cpu/x86/icache_x86.hpp
@@ -40,21 +40,13 @@
class ICache : public AbstractICache {
public:
-#ifdef AMD64
enum {
stub_size = 64, // Size of the icache flush stub in bytes
line_size = 64, // Icache line size in bytes
log2_line_size = 6 // log2(line_size)
};
- // Use default implementation
-#else
- enum {
- stub_size = 16, // Size of the icache flush stub in bytes
- line_size = BytesPerWord, // conservative
- log2_line_size = LogBytesPerWord // log2(line_size)
- };
-#endif // AMD64
+ static void initialize(int phase);
};
#endif // CPU_X86_ICACHE_X86_HPP
diff --git a/src/hotspot/cpu/x86/interp_masm_x86.cpp b/src/hotspot/cpu/x86/interp_masm_x86.cpp
index 84a99060a3efa..d982495d883df 100644
--- a/src/hotspot/cpu/x86/interp_masm_x86.cpp
+++ b/src/hotspot/cpu/x86/interp_masm_x86.cpp
@@ -296,7 +296,6 @@ void InterpreterMacroAssembler::call_VM_leaf_base(address entry_point,
}
void InterpreterMacroAssembler::call_VM_base(Register oop_result,
- Register java_thread,
Register last_java_sp,
address entry_point,
int number_of_arguments,
@@ -319,7 +318,7 @@ void InterpreterMacroAssembler::call_VM_base(Register oop_result,
}
#endif /* ASSERT */
// super call
- MacroAssembler::call_VM_base(oop_result, noreg, last_java_sp,
+ MacroAssembler::call_VM_base(oop_result, last_java_sp,
entry_point, number_of_arguments,
check_exceptions);
// interpreter specific
@@ -379,7 +378,7 @@ void InterpreterMacroAssembler::restore_after_resume(bool is_native) {
}
}
-void InterpreterMacroAssembler::check_and_handle_popframe(Register java_thread) {
+void InterpreterMacroAssembler::check_and_handle_popframe() {
if (JvmtiExport::can_pop_frame()) {
Label L;
// Initiate popframe handling only if it is not already being
@@ -389,7 +388,7 @@ void InterpreterMacroAssembler::check_and_handle_popframe(Register java_thread)
// This method is only called just after the call into the vm in
// call_VM_base, so the arg registers are available.
Register pop_cond = c_rarg0;
- movl(pop_cond, Address(java_thread, JavaThread::popframe_condition_offset()));
+ movl(pop_cond, Address(r15_thread, JavaThread::popframe_condition_offset()));
testl(pop_cond, JavaThread::popframe_pending_bit);
jcc(Assembler::zero, L);
testl(pop_cond, JavaThread::popframe_processing_bit);
@@ -418,8 +417,8 @@ void InterpreterMacroAssembler::load_earlyret_value(TosState state) {
case ctos: // fall through
case stos: // fall through
case itos: movl(rax, val_addr); break;
- case ftos: load_float(val_addr); break;
- case dtos: load_double(val_addr); break;
+ case ftos: movflt(xmm0, val_addr); break;
+ case dtos: movdbl(xmm0, val_addr); break;
case vtos: /* nothing to do */ break;
default : ShouldNotReachHere();
}
@@ -430,7 +429,7 @@ void InterpreterMacroAssembler::load_earlyret_value(TosState state) {
}
-void InterpreterMacroAssembler::check_and_handle_earlyret(Register java_thread) {
+void InterpreterMacroAssembler::check_and_handle_earlyret() {
if (JvmtiExport::can_force_early_return()) {
Label L;
Register tmp = c_rarg0;
@@ -810,13 +809,13 @@ void InterpreterMacroAssembler::remove_activation(
// the stack, will call InterpreterRuntime::at_unwind.
Label slow_path;
Label fast_path;
- safepoint_poll(slow_path, rthread, true /* at_return */, false /* in_nmethod */);
+ safepoint_poll(slow_path, true /* at_return */, false /* in_nmethod */);
jmp(fast_path);
bind(slow_path);
push(state);
- set_last_Java_frame(rthread, noreg, rbp, (address)pc(), rscratch1);
+ set_last_Java_frame(noreg, rbp, (address)pc(), rscratch1);
super_call_VM_leaf(CAST_FROM_FN_PTR(address, InterpreterRuntime::at_unwind), rthread);
- reset_last_Java_frame(rthread, true);
+ reset_last_Java_frame(true);
pop(state);
bind(fast_path);
@@ -1024,16 +1023,15 @@ void InterpreterMacroAssembler::lock_object(Register lock_reg) {
// Load object pointer into obj_reg
movptr(obj_reg, Address(lock_reg, obj_offset));
- if (DiagnoseSyncOnValueBasedClasses != 0) {
- load_klass(tmp_reg, obj_reg, rklass_decode_tmp);
- testb(Address(tmp_reg, Klass::misc_flags_offset()), KlassFlags::_misc_is_value_based_class);
- jcc(Assembler::notZero, slow_case);
- }
-
if (LockingMode == LM_LIGHTWEIGHT) {
- const Register thread = r15_thread;
- lightweight_lock(lock_reg, obj_reg, swap_reg, thread, tmp_reg, slow_case);
+ lightweight_lock(lock_reg, obj_reg, swap_reg, tmp_reg, slow_case);
} else if (LockingMode == LM_LEGACY) {
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(tmp_reg, obj_reg, rklass_decode_tmp);
+ testb(Address(tmp_reg, Klass::misc_flags_offset()), KlassFlags::_misc_is_value_based_class);
+ jcc(Assembler::notZero, slow_case);
+ }
+
// Load immediate 1 into swap_reg %rax
movl(swap_reg, 1);
@@ -1141,7 +1139,7 @@ void InterpreterMacroAssembler::unlock_object(Register lock_reg) {
movptr(Address(lock_reg, BasicObjectLock::obj_offset()), NULL_WORD);
if (LockingMode == LM_LIGHTWEIGHT) {
- lightweight_unlock(obj_reg, swap_reg, r15_thread, header_reg, slow_case);
+ lightweight_unlock(obj_reg, swap_reg, header_reg, slow_case);
} else if (LockingMode == LM_LEGACY) {
// Load the old header from BasicLock structure
movptr(header_reg, Address(swap_reg,
diff --git a/src/hotspot/cpu/x86/interp_masm_x86.hpp b/src/hotspot/cpu/x86/interp_masm_x86.hpp
index e537e9efc9678..308d700ff4fbf 100644
--- a/src/hotspot/cpu/x86/interp_masm_x86.hpp
+++ b/src/hotspot/cpu/x86/interp_masm_x86.hpp
@@ -42,7 +42,6 @@ class InterpreterMacroAssembler: public MacroAssembler {
protected:
virtual void call_VM_base(Register oop_result,
- Register java_thread,
Register last_java_sp,
address entry_point,
int number_of_arguments,
@@ -58,8 +57,8 @@ class InterpreterMacroAssembler: public MacroAssembler {
void jump_to_entry(address entry);
- virtual void check_and_handle_popframe(Register java_thread);
- virtual void check_and_handle_earlyret(Register java_thread);
+ virtual void check_and_handle_popframe();
+ virtual void check_and_handle_earlyret();
void load_earlyret_value(TosState state);
diff --git a/src/hotspot/cpu/x86/macroAssembler_x86.cpp b/src/hotspot/cpu/x86/macroAssembler_x86.cpp
index c92ce2f283cea..35e461b601f0f 100644
--- a/src/hotspot/cpu/x86/macroAssembler_x86.cpp
+++ b/src/hotspot/cpu/x86/macroAssembler_x86.cpp
@@ -24,6 +24,7 @@
#include "asm/assembler.hpp"
#include "asm/assembler.inline.hpp"
+#include "code/aotCodeCache.hpp"
#include "code/compiledIC.hpp"
#include "compiler/compiler_globals.hpp"
#include "compiler/disassembler.hpp"
@@ -93,395 +94,6 @@ static const Assembler::Condition reverse[] = {
// Implementation of MacroAssembler
-// First all the versions that have distinct versions depending on 32/64 bit
-// Unless the difference is trivial (1 line or so).
-
-#ifndef _LP64
-
-// 32bit versions
-
-Address MacroAssembler::as_Address(AddressLiteral adr) {
- return Address(adr.target(), adr.rspec());
-}
-
-Address MacroAssembler::as_Address(ArrayAddress adr, Register rscratch) {
- assert(rscratch == noreg, "");
- return Address::make_array(adr);
-}
-
-void MacroAssembler::call_VM_leaf_base(address entry_point,
- int number_of_arguments) {
- call(RuntimeAddress(entry_point));
- increment(rsp, number_of_arguments * wordSize);
-}
-
-void MacroAssembler::cmpklass(Address src1, Metadata* obj) {
- cmp_literal32(src1, (int32_t)obj, metadata_Relocation::spec_for_immediate());
-}
-
-
-void MacroAssembler::cmpklass(Register src1, Metadata* obj) {
- cmp_literal32(src1, (int32_t)obj, metadata_Relocation::spec_for_immediate());
-}
-
-void MacroAssembler::cmpoop(Address src1, jobject obj) {
- cmp_literal32(src1, (int32_t)obj, oop_Relocation::spec_for_immediate());
-}
-
-void MacroAssembler::cmpoop(Register src1, jobject obj, Register rscratch) {
- assert(rscratch == noreg, "redundant");
- cmp_literal32(src1, (int32_t)obj, oop_Relocation::spec_for_immediate());
-}
-
-void MacroAssembler::extend_sign(Register hi, Register lo) {
- // According to Intel Doc. AP-526, "Integer Divide", p.18.
- if (VM_Version::is_P6() && hi == rdx && lo == rax) {
- cdql();
- } else {
- movl(hi, lo);
- sarl(hi, 31);
- }
-}
-
-void MacroAssembler::jC2(Register tmp, Label& L) {
- // set parity bit if FPU flag C2 is set (via rax)
- save_rax(tmp);
- fwait(); fnstsw_ax();
- sahf();
- restore_rax(tmp);
- // branch
- jcc(Assembler::parity, L);
-}
-
-void MacroAssembler::jnC2(Register tmp, Label& L) {
- // set parity bit if FPU flag C2 is set (via rax)
- save_rax(tmp);
- fwait(); fnstsw_ax();
- sahf();
- restore_rax(tmp);
- // branch
- jcc(Assembler::noParity, L);
-}
-
-// 32bit can do a case table jump in one instruction but we no longer allow the base
-// to be installed in the Address class
-void MacroAssembler::jump(ArrayAddress entry, Register rscratch) {
- assert(rscratch == noreg, "not needed");
- jmp(as_Address(entry, noreg));
-}
-
-// Note: y_lo will be destroyed
-void MacroAssembler::lcmp2int(Register x_hi, Register x_lo, Register y_hi, Register y_lo) {
- // Long compare for Java (semantics as described in JVM spec.)
- Label high, low, done;
-
- cmpl(x_hi, y_hi);
- jcc(Assembler::less, low);
- jcc(Assembler::greater, high);
- // x_hi is the return register
- xorl(x_hi, x_hi);
- cmpl(x_lo, y_lo);
- jcc(Assembler::below, low);
- jcc(Assembler::equal, done);
-
- bind(high);
- xorl(x_hi, x_hi);
- increment(x_hi);
- jmp(done);
-
- bind(low);
- xorl(x_hi, x_hi);
- decrementl(x_hi);
-
- bind(done);
-}
-
-void MacroAssembler::lea(Register dst, AddressLiteral src) {
- mov_literal32(dst, (int32_t)src.target(), src.rspec());
-}
-
-void MacroAssembler::lea(Address dst, AddressLiteral adr, Register rscratch) {
- assert(rscratch == noreg, "not needed");
-
- // leal(dst, as_Address(adr));
- // see note in movl as to why we must use a move
- mov_literal32(dst, (int32_t)adr.target(), adr.rspec());
-}
-
-void MacroAssembler::leave() {
- mov(rsp, rbp);
- pop(rbp);
-}
-
-void MacroAssembler::lmul(int x_rsp_offset, int y_rsp_offset) {
- // Multiplication of two Java long values stored on the stack
- // as illustrated below. Result is in rdx:rax.
- //
- // rsp ---> [ ?? ] \ \
- // .... | y_rsp_offset |
- // [ y_lo ] / (in bytes) | x_rsp_offset
- // [ y_hi ] | (in bytes)
- // .... |
- // [ x_lo ] /
- // [ x_hi ]
- // ....
- //
- // Basic idea: lo(result) = lo(x_lo * y_lo)
- // hi(result) = hi(x_lo * y_lo) + lo(x_hi * y_lo) + lo(x_lo * y_hi)
- Address x_hi(rsp, x_rsp_offset + wordSize); Address x_lo(rsp, x_rsp_offset);
- Address y_hi(rsp, y_rsp_offset + wordSize); Address y_lo(rsp, y_rsp_offset);
- Label quick;
- // load x_hi, y_hi and check if quick
- // multiplication is possible
- movl(rbx, x_hi);
- movl(rcx, y_hi);
- movl(rax, rbx);
- orl(rbx, rcx); // rbx, = 0 <=> x_hi = 0 and y_hi = 0
- jcc(Assembler::zero, quick); // if rbx, = 0 do quick multiply
- // do full multiplication
- // 1st step
- mull(y_lo); // x_hi * y_lo
- movl(rbx, rax); // save lo(x_hi * y_lo) in rbx,
- // 2nd step
- movl(rax, x_lo);
- mull(rcx); // x_lo * y_hi
- addl(rbx, rax); // add lo(x_lo * y_hi) to rbx,
- // 3rd step
- bind(quick); // note: rbx, = 0 if quick multiply!
- movl(rax, x_lo);
- mull(y_lo); // x_lo * y_lo
- addl(rdx, rbx); // correct hi(x_lo * y_lo)
-}
-
-void MacroAssembler::lneg(Register hi, Register lo) {
- negl(lo);
- adcl(hi, 0);
- negl(hi);
-}
-
-void MacroAssembler::lshl(Register hi, Register lo) {
- // Java shift left long support (semantics as described in JVM spec., p.305)
- // (basic idea for shift counts s >= n: x << s == (x << n) << (s - n))
- // shift value is in rcx !
- assert(hi != rcx, "must not use rcx");
- assert(lo != rcx, "must not use rcx");
- const Register s = rcx; // shift count
- const int n = BitsPerWord;
- Label L;
- andl(s, 0x3f); // s := s & 0x3f (s < 0x40)
- cmpl(s, n); // if (s < n)
- jcc(Assembler::less, L); // else (s >= n)
- movl(hi, lo); // x := x << n
- xorl(lo, lo);
- // Note: subl(s, n) is not needed since the Intel shift instructions work rcx mod n!
- bind(L); // s (mod n) < n
- shldl(hi, lo); // x := x << s
- shll(lo);
-}
-
-
-void MacroAssembler::lshr(Register hi, Register lo, bool sign_extension) {
- // Java shift right long support (semantics as described in JVM spec., p.306 & p.310)
- // (basic idea for shift counts s >= n: x >> s == (x >> n) >> (s - n))
- assert(hi != rcx, "must not use rcx");
- assert(lo != rcx, "must not use rcx");
- const Register s = rcx; // shift count
- const int n = BitsPerWord;
- Label L;
- andl(s, 0x3f); // s := s & 0x3f (s < 0x40)
- cmpl(s, n); // if (s < n)
- jcc(Assembler::less, L); // else (s >= n)
- movl(lo, hi); // x := x >> n
- if (sign_extension) sarl(hi, 31);
- else xorl(hi, hi);
- // Note: subl(s, n) is not needed since the Intel shift instructions work rcx mod n!
- bind(L); // s (mod n) < n
- shrdl(lo, hi); // x := x >> s
- if (sign_extension) sarl(hi);
- else shrl(hi);
-}
-
-void MacroAssembler::movoop(Register dst, jobject obj) {
- mov_literal32(dst, (int32_t)obj, oop_Relocation::spec_for_immediate());
-}
-
-void MacroAssembler::movoop(Address dst, jobject obj, Register rscratch) {
- assert(rscratch == noreg, "redundant");
- mov_literal32(dst, (int32_t)obj, oop_Relocation::spec_for_immediate());
-}
-
-void MacroAssembler::mov_metadata(Register dst, Metadata* obj) {
- mov_literal32(dst, (int32_t)obj, metadata_Relocation::spec_for_immediate());
-}
-
-void MacroAssembler::mov_metadata(Address dst, Metadata* obj, Register rscratch) {
- assert(rscratch == noreg, "redundant");
- mov_literal32(dst, (int32_t)obj, metadata_Relocation::spec_for_immediate());
-}
-
-void MacroAssembler::movptr(Register dst, AddressLiteral src) {
- if (src.is_lval()) {
- mov_literal32(dst, (intptr_t)src.target(), src.rspec());
- } else {
- movl(dst, as_Address(src));
- }
-}
-
-void MacroAssembler::movptr(ArrayAddress dst, Register src, Register rscratch) {
- assert(rscratch == noreg, "redundant");
- movl(as_Address(dst, noreg), src);
-}
-
-void MacroAssembler::movptr(Register dst, ArrayAddress src) {
- movl(dst, as_Address(src, noreg));
-}
-
-void MacroAssembler::movptr(Address dst, intptr_t src, Register rscratch) {
- assert(rscratch == noreg, "redundant");
- movl(dst, src);
-}
-
-void MacroAssembler::pushoop(jobject obj, Register rscratch) {
- assert(rscratch == noreg, "redundant");
- push_literal32((int32_t)obj, oop_Relocation::spec_for_immediate());
-}
-
-void MacroAssembler::pushklass(Metadata* obj, Register rscratch) {
- assert(rscratch == noreg, "redundant");
- push_literal32((int32_t)obj, metadata_Relocation::spec_for_immediate());
-}
-
-void MacroAssembler::pushptr(AddressLiteral src, Register rscratch) {
- assert(rscratch == noreg, "redundant");
- if (src.is_lval()) {
- push_literal32((int32_t)src.target(), src.rspec());
- } else {
- pushl(as_Address(src));
- }
-}
-
-static void pass_arg0(MacroAssembler* masm, Register arg) {
- masm->push(arg);
-}
-
-static void pass_arg1(MacroAssembler* masm, Register arg) {
- masm->push(arg);
-}
-
-static void pass_arg2(MacroAssembler* masm, Register arg) {
- masm->push(arg);
-}
-
-static void pass_arg3(MacroAssembler* masm, Register arg) {
- masm->push(arg);
-}
-
-#ifndef PRODUCT
-extern "C" void findpc(intptr_t x);
-#endif
-
-void MacroAssembler::debug32(int rdi, int rsi, int rbp, int rsp, int rbx, int rdx, int rcx, int rax, int eip, char* msg) {
- // In order to get locks to work, we need to fake a in_VM state
- JavaThread* thread = JavaThread::current();
- JavaThreadState saved_state = thread->thread_state();
- thread->set_thread_state(_thread_in_vm);
- if (ShowMessageBoxOnError) {
- JavaThread* thread = JavaThread::current();
- JavaThreadState saved_state = thread->thread_state();
- thread->set_thread_state(_thread_in_vm);
- if (CountBytecodes || TraceBytecodes || StopInterpreterAt) {
- ttyLocker ttyl;
- BytecodeCounter::print();
- }
- // To see where a verify_oop failed, get $ebx+40/X for this frame.
- // This is the value of eip which points to where verify_oop will return.
- if (os::message_box(msg, "Execution stopped, print registers?")) {
- print_state32(rdi, rsi, rbp, rsp, rbx, rdx, rcx, rax, eip);
- BREAKPOINT;
- }
- }
- fatal("DEBUG MESSAGE: %s", msg);
-}
-
-void MacroAssembler::print_state32(int rdi, int rsi, int rbp, int rsp, int rbx, int rdx, int rcx, int rax, int eip) {
- ttyLocker ttyl;
- DebuggingContext debugging{};
- tty->print_cr("eip = 0x%08x", eip);
-#ifndef PRODUCT
- if ((WizardMode || Verbose) && PrintMiscellaneous) {
- tty->cr();
- findpc(eip);
- tty->cr();
- }
-#endif
-#define PRINT_REG(rax) \
- { tty->print("%s = ", #rax); os::print_location(tty, rax); }
- PRINT_REG(rax);
- PRINT_REG(rbx);
- PRINT_REG(rcx);
- PRINT_REG(rdx);
- PRINT_REG(rdi);
- PRINT_REG(rsi);
- PRINT_REG(rbp);
- PRINT_REG(rsp);
-#undef PRINT_REG
- // Print some words near top of staack.
- int* dump_sp = (int*) rsp;
- for (int col1 = 0; col1 < 8; col1++) {
- tty->print("(rsp+0x%03x) 0x%08x: ", (int)((intptr_t)dump_sp - (intptr_t)rsp), (intptr_t)dump_sp);
- os::print_location(tty, *dump_sp++);
- }
- for (int row = 0; row < 16; row++) {
- tty->print("(rsp+0x%03x) 0x%08x: ", (int)((intptr_t)dump_sp - (intptr_t)rsp), (intptr_t)dump_sp);
- for (int col = 0; col < 8; col++) {
- tty->print(" 0x%08x", *dump_sp++);
- }
- tty->cr();
- }
- // Print some instructions around pc:
- Disassembler::decode((address)eip-64, (address)eip);
- tty->print_cr("--------");
- Disassembler::decode((address)eip, (address)eip+32);
-}
-
-void MacroAssembler::stop(const char* msg) {
- // push address of message
- ExternalAddress message((address)msg);
- pushptr(message.addr(), noreg);
- { Label L; call(L, relocInfo::none); bind(L); } // push eip
- pusha(); // push registers
- call(RuntimeAddress(CAST_FROM_FN_PTR(address, MacroAssembler::debug32)));
- hlt();
-}
-
-void MacroAssembler::warn(const char* msg) {
- push_CPU_state();
-
- // push address of message
- ExternalAddress message((address)msg);
- pushptr(message.addr(), noreg);
-
- call(RuntimeAddress(CAST_FROM_FN_PTR(address, warning)));
- addl(rsp, wordSize); // discard argument
- pop_CPU_state();
-}
-
-void MacroAssembler::print_state() {
- { Label L; call(L, relocInfo::none); bind(L); } // push eip
- pusha(); // push registers
-
- push_CPU_state();
- call(RuntimeAddress(CAST_FROM_FN_PTR(address, MacroAssembler::print_state32)));
- pop_CPU_state();
-
- popa();
- addl(rsp, wordSize);
-}
-
-#else // _LP64
-
-// 64 bit versions
-
Address MacroAssembler::as_Address(AddressLiteral adr) {
// amd64 always does this as a pc-rel
// we can be absolute or disp based on the instruction type
@@ -724,17 +336,6 @@ void MacroAssembler::pushptr(AddressLiteral src, Register rscratch) {
}
}
-void MacroAssembler::reset_last_Java_frame(bool clear_fp) {
- reset_last_Java_frame(r15_thread, clear_fp);
-}
-
-void MacroAssembler::set_last_Java_frame(Register last_java_sp,
- Register last_java_fp,
- address last_java_pc,
- Register rscratch) {
- set_last_Java_frame(r15_thread, last_java_sp, last_java_fp, last_java_pc, rscratch);
-}
-
static void pass_arg0(MacroAssembler* masm, Register arg) {
if (c_rarg0 != arg ) {
masm->mov(c_rarg0, arg);
@@ -766,7 +367,9 @@ void MacroAssembler::stop(const char* msg) {
lea(c_rarg1, InternalAddress(rip));
movq(c_rarg2, rsp); // pass pointer to regs array
}
- lea(c_rarg0, ExternalAddress((address) msg));
+ // Skip AOT caching C strings in scratch buffer.
+ const char* str = (code_section()->scratch_emit()) ? msg : AOTCodeCache::add_C_string(msg);
+ lea(c_rarg0, ExternalAddress((address) str));
andq(rsp, -16); // align stack as required by ABI
call(RuntimeAddress(CAST_FROM_FN_PTR(address, MacroAssembler::debug64)));
hlt();
@@ -1104,20 +707,16 @@ void MacroAssembler::object_move(OopMap* map,
}
}
-#endif // _LP64
-
-// Now versions that are common to 32/64 bit
-
void MacroAssembler::addptr(Register dst, int32_t imm32) {
- LP64_ONLY(addq(dst, imm32)) NOT_LP64(addl(dst, imm32));
+ addq(dst, imm32);
}
void MacroAssembler::addptr(Register dst, Register src) {
- LP64_ONLY(addq(dst, src)) NOT_LP64(addl(dst, src));
+ addq(dst, src);
}
void MacroAssembler::addptr(Address dst, Register src) {
- LP64_ONLY(addq(dst, src)) NOT_LP64(addl(dst, src));
+ addq(dst, src);
}
void MacroAssembler::addsd(XMMRegister dst, AddressLiteral src, Register rscratch) {
@@ -1227,10 +826,9 @@ void MacroAssembler::andps(XMMRegister dst, AddressLiteral src, Register rscratc
}
void MacroAssembler::andptr(Register dst, int32_t imm32) {
- LP64_ONLY(andq(dst, imm32)) NOT_LP64(andl(dst, imm32));
+ andq(dst, imm32);
}
-#ifdef _LP64
void MacroAssembler::andq(Register dst, AddressLiteral src, Register rscratch) {
assert(rscratch != noreg || always_reachable(src), "missing");
@@ -1241,7 +839,6 @@ void MacroAssembler::andq(Register dst, AddressLiteral src, Register rscratch) {
andq(dst, Address(rscratch, 0));
}
}
-#endif
void MacroAssembler::atomic_incl(Address counter_addr) {
lock();
@@ -1259,7 +856,6 @@ void MacroAssembler::atomic_incl(AddressLiteral counter_addr, Register rscratch)
}
}
-#ifdef _LP64
void MacroAssembler::atomic_incq(Address counter_addr) {
lock();
incrementq(counter_addr);
@@ -1275,7 +871,6 @@ void MacroAssembler::atomic_incq(AddressLiteral counter_addr, Register rscratch)
atomic_incq(Address(rscratch, 0));
}
}
-#endif
// Writes to stack successive pages until offset reached to check for
// stack overflow + shadow pages. This clobbers tmp.
@@ -1307,13 +902,11 @@ void MacroAssembler::bang_stack_size(Register size, Register tmp) {
void MacroAssembler::reserved_stack_check() {
// testing if reserved zone needs to be enabled
Label no_reserved_zone_enabling;
- Register thread = NOT_LP64(rsi) LP64_ONLY(r15_thread);
- NOT_LP64(get_thread(rsi);)
- cmpptr(rsp, Address(thread, JavaThread::reserved_stack_activation_offset()));
+ cmpptr(rsp, Address(r15_thread, JavaThread::reserved_stack_activation_offset()));
jcc(Assembler::below, no_reserved_zone_enabling);
- call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::enable_stack_reserved_zone), thread);
+ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::enable_stack_reserved_zone), r15_thread);
jump(RuntimeAddress(SharedRuntime::throw_delayed_StackOverflowError_entry()));
should_not_reach_here();
@@ -1351,24 +944,19 @@ void MacroAssembler::call(AddressLiteral entry, Register rscratch) {
void MacroAssembler::ic_call(address entry, jint method_index) {
RelocationHolder rh = virtual_call_Relocation::spec(pc(), method_index);
-#ifdef _LP64
// Needs full 64-bit immediate for later patching.
mov64(rax, (int64_t)Universe::non_oop_word());
-#else
- movptr(rax, (intptr_t)Universe::non_oop_word());
-#endif
call(AddressLiteral(entry, rh));
}
int MacroAssembler::ic_check_size() {
- return
- LP64_ONLY(UseCompactObjectHeaders ? 17 : 14) NOT_LP64(12);
+ return UseCompactObjectHeaders ? 17 : 14;
}
int MacroAssembler::ic_check(int end_alignment) {
- Register receiver = LP64_ONLY(j_rarg0) NOT_LP64(rcx);
+ Register receiver = j_rarg0;
Register data = rax;
- Register temp = LP64_ONLY(rscratch1) NOT_LP64(rbx);
+ Register temp = rscratch1;
// The UEP of a code blob ensures that the VEP is padded. However, the padding of the UEP is placed
// before the inline cache check, so we don't have to execute any nop instructions when dispatching
@@ -1378,13 +966,10 @@ int MacroAssembler::ic_check(int end_alignment) {
int uep_offset = offset();
-#ifdef _LP64
if (UseCompactObjectHeaders) {
load_narrow_klass_compact(temp, receiver);
cmpl(temp, Address(data, CompiledICData::speculated_klass_offset()));
- } else
-#endif
- if (UseCompressedClassPointers) {
+ } else if (UseCompressedClassPointers) {
movl(temp, Address(receiver, oopDesc::klass_offset_in_bytes()));
cmpl(temp, Address(data, CompiledICData::speculated_klass_offset()));
} else {
@@ -1449,7 +1034,7 @@ void MacroAssembler::call_VM(Register oop_result,
bind(C);
- LP64_ONLY(assert_different_registers(arg_1, c_rarg2));
+ assert_different_registers(arg_1, c_rarg2);
pass_arg2(this, arg_2);
pass_arg1(this, arg_1);
@@ -1471,8 +1056,8 @@ void MacroAssembler::call_VM(Register oop_result,
bind(C);
- LP64_ONLY(assert_different_registers(arg_1, c_rarg2, c_rarg3));
- LP64_ONLY(assert_different_registers(arg_2, c_rarg3));
+ assert_different_registers(arg_1, c_rarg2, c_rarg3);
+ assert_different_registers(arg_2, c_rarg3);
pass_arg3(this, arg_3);
pass_arg2(this, arg_2);
pass_arg1(this, arg_1);
@@ -1487,8 +1072,7 @@ void MacroAssembler::call_VM(Register oop_result,
address entry_point,
int number_of_arguments,
bool check_exceptions) {
- Register thread = LP64_ONLY(r15_thread) NOT_LP64(noreg);
- call_VM_base(oop_result, thread, last_java_sp, entry_point, number_of_arguments, check_exceptions);
+ call_VM_base(oop_result, last_java_sp, entry_point, number_of_arguments, check_exceptions);
}
void MacroAssembler::call_VM(Register oop_result,
@@ -1507,7 +1091,7 @@ void MacroAssembler::call_VM(Register oop_result,
Register arg_2,
bool check_exceptions) {
- LP64_ONLY(assert_different_registers(arg_1, c_rarg2));
+ assert_different_registers(arg_1, c_rarg2);
pass_arg2(this, arg_2);
pass_arg1(this, arg_1);
call_VM(oop_result, last_java_sp, entry_point, 2, check_exceptions);
@@ -1520,8 +1104,8 @@ void MacroAssembler::call_VM(Register oop_result,
Register arg_2,
Register arg_3,
bool check_exceptions) {
- LP64_ONLY(assert_different_registers(arg_1, c_rarg2, c_rarg3));
- LP64_ONLY(assert_different_registers(arg_2, c_rarg3));
+ assert_different_registers(arg_1, c_rarg2, c_rarg3);
+ assert_different_registers(arg_2, c_rarg3);
pass_arg3(this, arg_3);
pass_arg2(this, arg_2);
pass_arg1(this, arg_1);
@@ -1533,8 +1117,7 @@ void MacroAssembler::super_call_VM(Register oop_result,
address entry_point,
int number_of_arguments,
bool check_exceptions) {
- Register thread = LP64_ONLY(r15_thread) NOT_LP64(noreg);
- MacroAssembler::call_VM_base(oop_result, thread, last_java_sp, entry_point, number_of_arguments, check_exceptions);
+ MacroAssembler::call_VM_base(oop_result, last_java_sp, entry_point, number_of_arguments, check_exceptions);
}
void MacroAssembler::super_call_VM(Register oop_result,
@@ -1553,7 +1136,7 @@ void MacroAssembler::super_call_VM(Register oop_result,
Register arg_2,
bool check_exceptions) {
- LP64_ONLY(assert_different_registers(arg_1, c_rarg2));
+ assert_different_registers(arg_1, c_rarg2);
pass_arg2(this, arg_2);
pass_arg1(this, arg_1);
super_call_VM(oop_result, last_java_sp, entry_point, 2, check_exceptions);
@@ -1566,8 +1149,8 @@ void MacroAssembler::super_call_VM(Register oop_result,
Register arg_2,
Register arg_3,
bool check_exceptions) {
- LP64_ONLY(assert_different_registers(arg_1, c_rarg2, c_rarg3));
- LP64_ONLY(assert_different_registers(arg_2, c_rarg3));
+ assert_different_registers(arg_1, c_rarg2, c_rarg3);
+ assert_different_registers(arg_2, c_rarg3);
pass_arg3(this, arg_3);
pass_arg2(this, arg_2);
pass_arg1(this, arg_1);
@@ -1575,31 +1158,22 @@ void MacroAssembler::super_call_VM(Register oop_result,
}
void MacroAssembler::call_VM_base(Register oop_result,
- Register java_thread,
Register last_java_sp,
address entry_point,
int number_of_arguments,
bool check_exceptions) {
- // determine java_thread register
- if (!java_thread->is_valid()) {
-#ifdef _LP64
- java_thread = r15_thread;
-#else
- java_thread = rdi;
- get_thread(java_thread);
-#endif // LP64
- }
+ Register java_thread = r15_thread;
+
// determine last_java_sp register
if (!last_java_sp->is_valid()) {
last_java_sp = rsp;
}
// debugging support
assert(number_of_arguments >= 0 , "cannot have negative number of arguments");
- LP64_ONLY(assert(java_thread == r15_thread, "unexpected register"));
#ifdef ASSERT
// TraceBytecodes does not use r12 but saves it over the call, so don't verify
// r12 is the heapbase.
- LP64_ONLY(if (UseCompressedOops && !TraceBytecodes) verify_heapbase("call_VM_base: heap base corrupted?");)
+ if (UseCompressedOops && !TraceBytecodes) verify_heapbase("call_VM_base: heap base corrupted?");
#endif // ASSERT
assert(java_thread != oop_result , "cannot use the same register for java_thread & oop_result");
@@ -1607,53 +1181,42 @@ void MacroAssembler::call_VM_base(Register oop_result,
// push java thread (becomes first argument of C function)
- NOT_LP64(push(java_thread); number_of_arguments++);
- LP64_ONLY(mov(c_rarg0, r15_thread));
+ mov(c_rarg0, r15_thread);
// set last Java frame before call
assert(last_java_sp != rbp, "can't use ebp/rbp");
// Only interpreter should have to set fp
- set_last_Java_frame(java_thread, last_java_sp, rbp, nullptr, rscratch1);
+ set_last_Java_frame(last_java_sp, rbp, nullptr, rscratch1);
// do the call, remove parameters
MacroAssembler::call_VM_leaf_base(entry_point, number_of_arguments);
- // restore the thread (cannot use the pushed argument since arguments
- // may be overwritten by C code generated by an optimizing compiler);
- // however can use the register value directly if it is callee saved.
- if (LP64_ONLY(true ||) java_thread == rdi || java_thread == rsi) {
- // rdi & rsi (also r15) are callee saved -> nothing to do
#ifdef ASSERT
- guarantee(java_thread != rax, "change this code");
- push(rax);
- { Label L;
- get_thread(rax);
- cmpptr(java_thread, rax);
- jcc(Assembler::equal, L);
- STOP("MacroAssembler::call_VM_base: rdi not callee saved?");
- bind(L);
- }
- pop(rax);
-#endif
- } else {
- get_thread(java_thread);
+ // Check that thread register is not clobbered.
+ guarantee(java_thread != rax, "change this code");
+ push(rax);
+ { Label L;
+ get_thread_slow(rax);
+ cmpptr(java_thread, rax);
+ jcc(Assembler::equal, L);
+ STOP("MacroAssembler::call_VM_base: java_thread not callee saved?");
+ bind(L);
}
+ pop(rax);
+#endif
+
// reset last Java frame
// Only interpreter should have to clear fp
- reset_last_Java_frame(java_thread, true);
+ reset_last_Java_frame(true);
// C++ interp handles this in the interpreter
- check_and_handle_popframe(java_thread);
- check_and_handle_earlyret(java_thread);
+ check_and_handle_popframe();
+ check_and_handle_earlyret();
if (check_exceptions) {
// check for pending exceptions (java_thread is set upon return)
- cmpptr(Address(java_thread, Thread::pending_exception_offset()), NULL_WORD);
-#ifndef _LP64
- jump_cc(Assembler::notEqual,
- RuntimeAddress(StubRoutines::forward_exception_entry()));
-#else
+ cmpptr(Address(r15_thread, Thread::pending_exception_offset()), NULL_WORD);
// This used to conditionally jump to forward_exception however it is
// possible if we relocate that the branch will not reach. So we must jump
// around so we can always reach
@@ -1662,36 +1225,24 @@ void MacroAssembler::call_VM_base(Register oop_result,
jcc(Assembler::equal, ok);
jump(RuntimeAddress(StubRoutines::forward_exception_entry()));
bind(ok);
-#endif // LP64
}
// get oop result if there is one and reset the value in the thread
if (oop_result->is_valid()) {
- get_vm_result(oop_result, java_thread);
+ get_vm_result_oop(oop_result);
}
}
void MacroAssembler::call_VM_helper(Register oop_result, address entry_point, int number_of_arguments, bool check_exceptions) {
+ // Calculating the value for last_Java_sp is somewhat subtle.
+ // call_VM does an intermediate call which places a return address on
+ // the stack just under the stack pointer as the user finished with it.
+ // This allows us to retrieve last_Java_pc from last_Java_sp[-1].
- // Calculate the value for last_Java_sp
- // somewhat subtle. call_VM does an intermediate call
- // which places a return address on the stack just under the
- // stack pointer as the user finished with it. This allows
- // use to retrieve last_Java_pc from last_Java_sp[-1].
- // On 32bit we then have to push additional args on the stack to accomplish
- // the actual requested call. On 64bit call_VM only can use register args
- // so the only extra space is the return address that call_VM created.
- // This hopefully explains the calculations here.
-
-#ifdef _LP64
// We've pushed one address, correct last_Java_sp
lea(rax, Address(rsp, wordSize));
-#else
- lea(rax, Address(rsp, (1 + number_of_arguments) * wordSize));
-#endif // LP64
-
- call_VM_base(oop_result, noreg, rax, entry_point, number_of_arguments, check_exceptions);
+ call_VM_base(oop_result, rax, entry_point, number_of_arguments, check_exceptions);
}
// Use this method when MacroAssembler version of call_VM_leaf_base() should be called from Interpreter.
@@ -1710,15 +1261,15 @@ void MacroAssembler::call_VM_leaf(address entry_point, Register arg_0) {
void MacroAssembler::call_VM_leaf(address entry_point, Register arg_0, Register arg_1) {
- LP64_ONLY(assert_different_registers(arg_0, c_rarg1));
+ assert_different_registers(arg_0, c_rarg1);
pass_arg1(this, arg_1);
pass_arg0(this, arg_0);
call_VM_leaf(entry_point, 2);
}
void MacroAssembler::call_VM_leaf(address entry_point, Register arg_0, Register arg_1, Register arg_2) {
- LP64_ONLY(assert_different_registers(arg_0, c_rarg1, c_rarg2));
- LP64_ONLY(assert_different_registers(arg_1, c_rarg2));
+ assert_different_registers(arg_0, c_rarg1, c_rarg2);
+ assert_different_registers(arg_1, c_rarg2);
pass_arg2(this, arg_2);
pass_arg1(this, arg_1);
pass_arg0(this, arg_0);
@@ -1726,9 +1277,9 @@ void MacroAssembler::call_VM_leaf(address entry_point, Register arg_0, Register
}
void MacroAssembler::call_VM_leaf(address entry_point, Register arg_0, Register arg_1, Register arg_2, Register arg_3) {
- LP64_ONLY(assert_different_registers(arg_0, c_rarg1, c_rarg2, c_rarg3));
- LP64_ONLY(assert_different_registers(arg_1, c_rarg2, c_rarg3));
- LP64_ONLY(assert_different_registers(arg_2, c_rarg3));
+ assert_different_registers(arg_0, c_rarg1, c_rarg2, c_rarg3);
+ assert_different_registers(arg_1, c_rarg2, c_rarg3);
+ assert_different_registers(arg_2, c_rarg3);
pass_arg3(this, arg_3);
pass_arg2(this, arg_2);
pass_arg1(this, arg_1);
@@ -1742,15 +1293,15 @@ void MacroAssembler::super_call_VM_leaf(address entry_point, Register arg_0) {
}
void MacroAssembler::super_call_VM_leaf(address entry_point, Register arg_0, Register arg_1) {
- LP64_ONLY(assert_different_registers(arg_0, c_rarg1));
+ assert_different_registers(arg_0, c_rarg1);
pass_arg1(this, arg_1);
pass_arg0(this, arg_0);
MacroAssembler::call_VM_leaf_base(entry_point, 2);
}
void MacroAssembler::super_call_VM_leaf(address entry_point, Register arg_0, Register arg_1, Register arg_2) {
- LP64_ONLY(assert_different_registers(arg_0, c_rarg1, c_rarg2));
- LP64_ONLY(assert_different_registers(arg_1, c_rarg2));
+ assert_different_registers(arg_0, c_rarg1, c_rarg2);
+ assert_different_registers(arg_1, c_rarg2);
pass_arg2(this, arg_2);
pass_arg1(this, arg_1);
pass_arg0(this, arg_0);
@@ -1758,9 +1309,9 @@ void MacroAssembler::super_call_VM_leaf(address entry_point, Register arg_0, Reg
}
void MacroAssembler::super_call_VM_leaf(address entry_point, Register arg_0, Register arg_1, Register arg_2, Register arg_3) {
- LP64_ONLY(assert_different_registers(arg_0, c_rarg1, c_rarg2, c_rarg3));
- LP64_ONLY(assert_different_registers(arg_1, c_rarg2, c_rarg3));
- LP64_ONLY(assert_different_registers(arg_2, c_rarg3));
+ assert_different_registers(arg_0, c_rarg1, c_rarg2, c_rarg3);
+ assert_different_registers(arg_1, c_rarg2, c_rarg3);
+ assert_different_registers(arg_2, c_rarg3);
pass_arg3(this, arg_3);
pass_arg2(this, arg_2);
pass_arg1(this, arg_1);
@@ -1768,21 +1319,21 @@ void MacroAssembler::super_call_VM_leaf(address entry_point, Register arg_0, Reg
MacroAssembler::call_VM_leaf_base(entry_point, 4);
}
-void MacroAssembler::get_vm_result(Register oop_result, Register java_thread) {
- movptr(oop_result, Address(java_thread, JavaThread::vm_result_offset()));
- movptr(Address(java_thread, JavaThread::vm_result_offset()), NULL_WORD);
+void MacroAssembler::get_vm_result_oop(Register oop_result) {
+ movptr(oop_result, Address(r15_thread, JavaThread::vm_result_oop_offset()));
+ movptr(Address(r15_thread, JavaThread::vm_result_oop_offset()), NULL_WORD);
verify_oop_msg(oop_result, "broken oop in call_VM_base");
}
-void MacroAssembler::get_vm_result_2(Register metadata_result, Register java_thread) {
- movptr(metadata_result, Address(java_thread, JavaThread::vm_result_2_offset()));
- movptr(Address(java_thread, JavaThread::vm_result_2_offset()), NULL_WORD);
+void MacroAssembler::get_vm_result_metadata(Register metadata_result) {
+ movptr(metadata_result, Address(r15_thread, JavaThread::vm_result_metadata_offset()));
+ movptr(Address(r15_thread, JavaThread::vm_result_metadata_offset()), NULL_WORD);
}
-void MacroAssembler::check_and_handle_earlyret(Register java_thread) {
+void MacroAssembler::check_and_handle_earlyret() {
}
-void MacroAssembler::check_and_handle_popframe(Register java_thread) {
+void MacroAssembler::check_and_handle_popframe() {
}
void MacroAssembler::cmp32(AddressLiteral src1, int32_t imm, Register rscratch) {
@@ -1873,7 +1424,6 @@ void MacroAssembler::cmp8(AddressLiteral src1, int imm, Register rscratch) {
}
void MacroAssembler::cmpptr(Register src1, AddressLiteral src2, Register rscratch) {
-#ifdef _LP64
assert(rscratch != noreg || always_reachable(src2), "missing");
if (src2.is_lval()) {
@@ -1885,26 +1435,13 @@ void MacroAssembler::cmpptr(Register src1, AddressLiteral src2, Register rscratc
lea(rscratch, src2);
Assembler::cmpq(src1, Address(rscratch, 0));
}
-#else
- assert(rscratch == noreg, "not needed");
- if (src2.is_lval()) {
- cmp_literal32(src1, (int32_t)src2.target(), src2.rspec());
- } else {
- cmpl(src1, as_Address(src2));
- }
-#endif // _LP64
}
void MacroAssembler::cmpptr(Address src1, AddressLiteral src2, Register rscratch) {
assert(src2.is_lval(), "not a mem-mem compare");
-#ifdef _LP64
// moves src2's literal address
movptr(rscratch, src2);
Assembler::cmpq(src1, rscratch);
-#else
- assert(rscratch == noreg, "not needed");
- cmp_literal32(src1, (int32_t)src2.target(), src2.rspec());
-#endif // _LP64
}
void MacroAssembler::cmpoop(Register src1, Register src2) {
@@ -1915,12 +1452,10 @@ void MacroAssembler::cmpoop(Register src1, Address src2) {
cmpptr(src1, src2);
}
-#ifdef _LP64
void MacroAssembler::cmpoop(Register src1, jobject src2, Register rscratch) {
movoop(rscratch, src2);
cmpptr(src1, rscratch);
}
-#endif
void MacroAssembler::locked_cmpxchgptr(Register reg, AddressLiteral adr, Register rscratch) {
assert(rscratch != noreg || always_reachable(adr), "missing");
@@ -1936,7 +1471,7 @@ void MacroAssembler::locked_cmpxchgptr(Register reg, AddressLiteral adr, Registe
}
void MacroAssembler::cmpxchgptr(Register reg, Address adr) {
- LP64_ONLY(cmpxchgq(reg, adr)) NOT_LP64(cmpxchgl(reg, adr));
+ cmpxchgq(reg, adr);
}
void MacroAssembler::comisd(XMMRegister dst, AddressLiteral src, Register rscratch) {
@@ -2099,115 +1634,6 @@ void MacroAssembler::fat_nop() {
}
}
-#ifndef _LP64
-void MacroAssembler::fcmp(Register tmp) {
- fcmp(tmp, 1, true, true);
-}
-
-void MacroAssembler::fcmp(Register tmp, int index, bool pop_left, bool pop_right) {
- assert(!pop_right || pop_left, "usage error");
- if (VM_Version::supports_cmov()) {
- assert(tmp == noreg, "unneeded temp");
- if (pop_left) {
- fucomip(index);
- } else {
- fucomi(index);
- }
- if (pop_right) {
- fpop();
- }
- } else {
- assert(tmp != noreg, "need temp");
- if (pop_left) {
- if (pop_right) {
- fcompp();
- } else {
- fcomp(index);
- }
- } else {
- fcom(index);
- }
- // convert FPU condition into eflags condition via rax,
- save_rax(tmp);
- fwait(); fnstsw_ax();
- sahf();
- restore_rax(tmp);
- }
- // condition codes set as follows:
- //
- // CF (corresponds to C0) if x < y
- // PF (corresponds to C2) if unordered
- // ZF (corresponds to C3) if x = y
-}
-
-void MacroAssembler::fcmp2int(Register dst, bool unordered_is_less) {
- fcmp2int(dst, unordered_is_less, 1, true, true);
-}
-
-void MacroAssembler::fcmp2int(Register dst, bool unordered_is_less, int index, bool pop_left, bool pop_right) {
- fcmp(VM_Version::supports_cmov() ? noreg : dst, index, pop_left, pop_right);
- Label L;
- if (unordered_is_less) {
- movl(dst, -1);
- jcc(Assembler::parity, L);
- jcc(Assembler::below , L);
- movl(dst, 0);
- jcc(Assembler::equal , L);
- increment(dst);
- } else { // unordered is greater
- movl(dst, 1);
- jcc(Assembler::parity, L);
- jcc(Assembler::above , L);
- movl(dst, 0);
- jcc(Assembler::equal , L);
- decrementl(dst);
- }
- bind(L);
-}
-
-void MacroAssembler::fld_d(AddressLiteral src) {
- fld_d(as_Address(src));
-}
-
-void MacroAssembler::fld_s(AddressLiteral src) {
- fld_s(as_Address(src));
-}
-
-void MacroAssembler::fldcw(AddressLiteral src) {
- fldcw(as_Address(src));
-}
-
-void MacroAssembler::fpop() {
- ffree();
- fincstp();
-}
-
-void MacroAssembler::fremr(Register tmp) {
- save_rax(tmp);
- { Label L;
- bind(L);
- fprem();
- fwait(); fnstsw_ax();
- sahf();
- jcc(Assembler::parity, L);
- }
- restore_rax(tmp);
- // Result is in ST0.
- // Note: fxch & fpop to get rid of ST1
- // (otherwise FPU stack could overflow eventually)
- fxch(1);
- fpop();
-}
-
-void MacroAssembler::empty_FPU_stack() {
- if (VM_Version::supports_mmx()) {
- emms();
- } else {
- for (int i = 8; i-- > 0; ) ffree(i);
- }
-}
-#endif // !LP64
-
void MacroAssembler::mulpd(XMMRegister dst, AddressLiteral src, Register rscratch) {
assert(rscratch != noreg || always_reachable(src), "missing");
if (reachable(src)) {
@@ -2218,54 +1644,6 @@ void MacroAssembler::mulpd(XMMRegister dst, AddressLiteral src, Register rscratc
}
}
-void MacroAssembler::load_float(Address src) {
-#ifdef _LP64
- movflt(xmm0, src);
-#else
- if (UseSSE >= 1) {
- movflt(xmm0, src);
- } else {
- fld_s(src);
- }
-#endif // LP64
-}
-
-void MacroAssembler::store_float(Address dst) {
-#ifdef _LP64
- movflt(dst, xmm0);
-#else
- if (UseSSE >= 1) {
- movflt(dst, xmm0);
- } else {
- fstp_s(dst);
- }
-#endif // LP64
-}
-
-void MacroAssembler::load_double(Address src) {
-#ifdef _LP64
- movdbl(xmm0, src);
-#else
- if (UseSSE >= 2) {
- movdbl(xmm0, src);
- } else {
- fld_d(src);
- }
-#endif // LP64
-}
-
-void MacroAssembler::store_double(Address dst) {
-#ifdef _LP64
- movdbl(dst, xmm0);
-#else
- if (UseSSE >= 2) {
- movdbl(dst, xmm0);
- } else {
- fstp_d(dst);
- }
-#endif // LP64
-}
-
// dst = c = a * b + c
void MacroAssembler::fmad(XMMRegister dst, XMMRegister a, XMMRegister b, XMMRegister c) {
Assembler::vfmadd231sd(c, a, b);
@@ -2415,15 +1793,8 @@ void MacroAssembler::ldmxcsr(AddressLiteral src, Register rscratch) {
}
int MacroAssembler::load_signed_byte(Register dst, Address src) {
- int off;
- if (LP64_ONLY(true ||) VM_Version::is_P6()) {
- off = offset();
- movsbl(dst, src); // movsxb
- } else {
- off = load_unsigned_byte(dst, src);
- shll(dst, 24);
- sarl(dst, 24);
- }
+ int off = offset();
+ movsbl(dst, src); // movsxb
return off;
}
@@ -2432,33 +1803,19 @@ int MacroAssembler::load_signed_byte(Register dst, Address src) {
// manual, which means 16 bits, that usage is found nowhere in HotSpot code.
// The term "word" in HotSpot means a 32- or 64-bit machine word.
int MacroAssembler::load_signed_short(Register dst, Address src) {
- int off;
- if (LP64_ONLY(true ||) VM_Version::is_P6()) {
- // This is dubious to me since it seems safe to do a signed 16 => 64 bit
- // version but this is what 64bit has always done. This seems to imply
- // that users are only using 32bits worth.
- off = offset();
- movswl(dst, src); // movsxw
- } else {
- off = load_unsigned_short(dst, src);
- shll(dst, 16);
- sarl(dst, 16);
- }
+ // This is dubious to me since it seems safe to do a signed 16 => 64 bit
+ // version but this is what 64bit has always done. This seems to imply
+ // that users are only using 32bits worth.
+ int off = offset();
+ movswl(dst, src); // movsxw
return off;
}
int MacroAssembler::load_unsigned_byte(Register dst, Address src) {
// According to Intel Doc. AP-526, "Zero-Extension of Short", p.16,
// and "3.9 Partial Register Penalties", p. 22).
- int off;
- if (LP64_ONLY(true || ) VM_Version::is_P6() || src.uses(dst)) {
- off = offset();
- movzbl(dst, src); // movzxb
- } else {
- xorl(dst, dst);
- off = offset();
- movb(dst, src);
- }
+ int off = offset();
+ movzbl(dst, src); // movzxb
return off;
}
@@ -2466,29 +1823,14 @@ int MacroAssembler::load_unsigned_byte(Register dst, Address src) {
int MacroAssembler::load_unsigned_short(Register dst, Address src) {
// According to Intel Doc. AP-526, "Zero-Extension of Short", p.16,
// and "3.9 Partial Register Penalties", p. 22).
- int off;
- if (LP64_ONLY(true ||) VM_Version::is_P6() || src.uses(dst)) {
- off = offset();
- movzwl(dst, src); // movzxw
- } else {
- xorl(dst, dst);
- off = offset();
- movw(dst, src);
- }
+ int off = offset();
+ movzwl(dst, src); // movzxw
return off;
}
void MacroAssembler::load_sized_value(Register dst, Address src, size_t size_in_bytes, bool is_signed, Register dst2) {
switch (size_in_bytes) {
-#ifndef _LP64
- case 8:
- assert(dst2 != noreg, "second dest register required");
- movl(dst, src);
- movl(dst2, src.plus_disp(BytesPerInt));
- break;
-#else
case 8: movq(dst, src); break;
-#endif
case 4: movl(dst, src); break;
case 2: is_signed ? load_signed_short(dst, src) : load_unsigned_short(dst, src); break;
case 1: is_signed ? load_signed_byte( dst, src) : load_unsigned_byte( dst, src); break;
@@ -2498,15 +1840,7 @@ void MacroAssembler::load_sized_value(Register dst, Address src, size_t size_in_
void MacroAssembler::store_sized_value(Address dst, Register src, size_t size_in_bytes, Register src2) {
switch (size_in_bytes) {
-#ifndef _LP64
- case 8:
- assert(src2 != noreg, "second source register required");
- movl(dst, src);
- movl(dst.plus_disp(BytesPerInt), src2);
- break;
-#else
case 8: movq(dst, src); break;
-#endif
case 4: movl(dst, src); break;
case 2: movw(dst, src); break;
case 1: movb(dst, src); break;
@@ -2625,16 +1959,15 @@ void MacroAssembler::movflt(XMMRegister dst, AddressLiteral src, Register rscrat
}
void MacroAssembler::movptr(Register dst, Register src) {
- LP64_ONLY(movq(dst, src)) NOT_LP64(movl(dst, src));
+ movq(dst, src);
}
void MacroAssembler::movptr(Register dst, Address src) {
- LP64_ONLY(movq(dst, src)) NOT_LP64(movl(dst, src));
+ movq(dst, src);
}
// src should NEVER be a real pointer. Use AddressLiteral for true pointers
void MacroAssembler::movptr(Register dst, intptr_t src) {
-#ifdef _LP64
if (is_uimm32(src)) {
movl(dst, checked_cast<uint32_t>(src));
} else if (is_simm32(src)) {
@@ -2642,17 +1975,14 @@ void MacroAssembler::movptr(Register dst, intptr_t src) {
} else {
mov64(dst, src);
}
-#else
- movl(dst, src);
-#endif
}
void MacroAssembler::movptr(Address dst, Register src) {
- LP64_ONLY(movq(dst, src)) NOT_LP64(movl(dst, src));
+ movq(dst, src);
}
void MacroAssembler::movptr(Address dst, int32_t src) {
- LP64_ONLY(movslq(dst, src)) NOT_LP64(movl(dst, src));
+ movslq(dst, src);
}
void MacroAssembler::movdqu(Address dst, XMMRegister src) {
@@ -3030,9 +2360,7 @@ void MacroAssembler::unimplemented(const char* what) {
stop(buf);
}
-#ifdef _LP64
#define XSTATE_BV 0x200
-#endif
void MacroAssembler::pop_CPU_state() {
pop_FPU_state();
@@ -3040,17 +2368,13 @@ void MacroAssembler::pop_CPU_state() {
}
void MacroAssembler::pop_FPU_state() {
-#ifndef _LP64
- frstor(Address(rsp, 0));
-#else
fxrstor(Address(rsp, 0));
-#endif
addptr(rsp, FPUStateSizeInWords * wordSize);
}
void MacroAssembler::pop_IU_state() {
popa();
- LP64_ONLY(addq(rsp, 8));
+ addq(rsp, 8);
popf();
}
@@ -3063,154 +2387,83 @@ void MacroAssembler::push_CPU_state() {
void MacroAssembler::push_FPU_state() {
subptr(rsp, FPUStateSizeInWords * wordSize);
-#ifndef _LP64
- fnsave(Address(rsp, 0));
- fwait();
-#else
fxsave(Address(rsp, 0));
-#endif // LP64
}
void MacroAssembler::push_IU_state() {
// Push flags first because pusha kills them
pushf();
// Make sure rsp stays 16-byte aligned
- LP64_ONLY(subq(rsp, 8));
+ subq(rsp, 8);
pusha();
}
void MacroAssembler::push_cont_fastpath() {
if (!Continuations::enabled()) return;
-#ifndef _LP64
- Register rthread = rax;
- Register rrealsp = rbx;
- push(rthread);
- push(rrealsp);
-
- get_thread(rthread);
-
- // The code below wants the original RSP.
- // Move it back after the pushes above.
- movptr(rrealsp, rsp);
- addptr(rrealsp, 2*wordSize);
-#else
- Register rthread = r15_thread;
- Register rrealsp = rsp;
-#endif
-
- Label done;
- cmpptr(rrealsp, Address(rthread, JavaThread::cont_fastpath_offset()));
- jccb(Assembler::belowEqual, done);
- movptr(Address(rthread, JavaThread::cont_fastpath_offset()), rrealsp);
- bind(done);
-
-#ifndef _LP64
- pop(rrealsp);
- pop(rthread);
-#endif
+ Label L_done;
+ cmpptr(rsp, Address(r15_thread, JavaThread::cont_fastpath_offset()));
+ jccb(Assembler::belowEqual, L_done);
+ movptr(Address(r15_thread, JavaThread::cont_fastpath_offset()), rsp);
+ bind(L_done);
}
void MacroAssembler::pop_cont_fastpath() {
if (!Continuations::enabled()) return;
-#ifndef _LP64
- Register rthread = rax;
- Register rrealsp = rbx;
- push(rthread);
- push(rrealsp);
-
- get_thread(rthread);
-
- // The code below wants the original RSP.
- // Move it back after the pushes above.
- movptr(rrealsp, rsp);
- addptr(rrealsp, 2*wordSize);
-#else
- Register rthread = r15_thread;
- Register rrealsp = rsp;
-#endif
-
- Label done;
- cmpptr(rrealsp, Address(rthread, JavaThread::cont_fastpath_offset()));
- jccb(Assembler::below, done);
- movptr(Address(rthread, JavaThread::cont_fastpath_offset()), 0);
- bind(done);
-
-#ifndef _LP64
- pop(rrealsp);
- pop(rthread);
-#endif
+ Label L_done;
+ cmpptr(rsp, Address(r15_thread, JavaThread::cont_fastpath_offset()));
+ jccb(Assembler::below, L_done);
+ movptr(Address(r15_thread, JavaThread::cont_fastpath_offset()), 0);
+ bind(L_done);
}
void MacroAssembler::inc_held_monitor_count() {
-#ifdef _LP64
incrementq(Address(r15_thread, JavaThread::held_monitor_count_offset()));
-#endif
}
void MacroAssembler::dec_held_monitor_count() {
-#ifdef _LP64
decrementq(Address(r15_thread, JavaThread::held_monitor_count_offset()));
-#endif
}
#ifdef ASSERT
void MacroAssembler::stop_if_in_cont(Register cont, const char* name) {
-#ifdef _LP64
Label no_cont;
movptr(cont, Address(r15_thread, JavaThread::cont_entry_offset()));
testl(cont, cont);
jcc(Assembler::zero, no_cont);
stop(name);
bind(no_cont);
-#else
- Unimplemented();
-#endif
}
#endif
-void MacroAssembler::reset_last_Java_frame(Register java_thread, bool clear_fp) { // determine java_thread register
- if (!java_thread->is_valid()) {
- java_thread = rdi;
- get_thread(java_thread);
- }
+void MacroAssembler::reset_last_Java_frame(bool clear_fp) {
// we must set sp to zero to clear frame
- movptr(Address(java_thread, JavaThread::last_Java_sp_offset()), NULL_WORD);
+ movptr(Address(r15_thread, JavaThread::last_Java_sp_offset()), NULL_WORD);
// must clear fp, so that compiled frames are not confused; it is
// possible that we need it only for debugging
if (clear_fp) {
- movptr(Address(java_thread, JavaThread::last_Java_fp_offset()), NULL_WORD);
+ movptr(Address(r15_thread, JavaThread::last_Java_fp_offset()), NULL_WORD);
}
// Always clear the pc because it could have been set by make_walkable()
- movptr(Address(java_thread, JavaThread::last_Java_pc_offset()), NULL_WORD);
+ movptr(Address(r15_thread, JavaThread::last_Java_pc_offset()), NULL_WORD);
vzeroupper();
}
-void MacroAssembler::restore_rax(Register tmp) {
- if (tmp == noreg) pop(rax);
- else if (tmp != rax) mov(rax, tmp);
-}
-
void MacroAssembler::round_to(Register reg, int modulus) {
addptr(reg, modulus - 1);
andptr(reg, -modulus);
}
-void MacroAssembler::save_rax(Register tmp) {
- if (tmp == noreg) push(rax);
- else if (tmp != rax) mov(tmp, rax);
-}
-
-void MacroAssembler::safepoint_poll(Label& slow_path, Register thread_reg, bool at_return, bool in_nmethod) {
+void MacroAssembler::safepoint_poll(Label& slow_path, bool at_return, bool in_nmethod) {
if (at_return) {
// Note that when in_nmethod is set, the stack pointer is incremented before the poll. Therefore,
// we may safely use rsp instead to perform the stack watermark check.
- cmpptr(in_nmethod ? rsp : rbp, Address(thread_reg, JavaThread::polling_word_offset()));
+ cmpptr(in_nmethod ? rsp : rbp, Address(r15_thread, JavaThread::polling_word_offset()));
jcc(Assembler::above, slow_path);
return;
}
- testb(Address(thread_reg, JavaThread::polling_word_offset()), SafepointMechanism::poll_bit());
+ testb(Address(r15_thread, JavaThread::polling_word_offset()), SafepointMechanism::poll_bit());
jcc(Assembler::notZero, slow_path); // handshake bit set implies poll
}
@@ -3219,69 +2472,51 @@ void MacroAssembler::safepoint_poll(Label& slow_path, Register thread_reg, bool
// When entering C land, the rbp, & rsp of the last Java frame have to be recorded
// in the (thread-local) JavaThread object. When leaving C land, the last Java fp
// has to be reset to 0. This is required to allow proper stack traversal.
-void MacroAssembler::set_last_Java_frame(Register java_thread,
- Register last_java_sp,
+void MacroAssembler::set_last_Java_frame(Register last_java_sp,
Register last_java_fp,
address last_java_pc,
Register rscratch) {
vzeroupper();
- // determine java_thread register
- if (!java_thread->is_valid()) {
- java_thread = rdi;
- get_thread(java_thread);
- }
// determine last_java_sp register
if (!last_java_sp->is_valid()) {
last_java_sp = rsp;
}
// last_java_fp is optional
if (last_java_fp->is_valid()) {
- movptr(Address(java_thread, JavaThread::last_Java_fp_offset()), last_java_fp);
+ movptr(Address(r15_thread, JavaThread::last_Java_fp_offset()), last_java_fp);
}
// last_java_pc is optional
if (last_java_pc != nullptr) {
- Address java_pc(java_thread,
+ Address java_pc(r15_thread,
JavaThread::frame_anchor_offset() + JavaFrameAnchor::last_Java_pc_offset());
lea(java_pc, InternalAddress(last_java_pc), rscratch);
}
- movptr(Address(java_thread, JavaThread::last_Java_sp_offset()), last_java_sp);
+ movptr(Address(r15_thread, JavaThread::last_Java_sp_offset()), last_java_sp);
}
-#ifdef _LP64
void MacroAssembler::set_last_Java_frame(Register last_java_sp,
Register last_java_fp,
Label &L,
Register scratch) {
lea(scratch, L);
movptr(Address(r15_thread, JavaThread::last_Java_pc_offset()), scratch);
- set_last_Java_frame(r15_thread, last_java_sp, last_java_fp, nullptr, scratch);
+ set_last_Java_frame(last_java_sp, last_java_fp, nullptr, scratch);
}
-#endif
void MacroAssembler::shlptr(Register dst, int imm8) {
- LP64_ONLY(shlq(dst, imm8)) NOT_LP64(shll(dst, imm8));
+ shlq(dst, imm8);
}
void MacroAssembler::shrptr(Register dst, int imm8) {
- LP64_ONLY(shrq(dst, imm8)) NOT_LP64(shrl(dst, imm8));
+ shrq(dst, imm8);
}
void MacroAssembler::sign_extend_byte(Register reg) {
- if (LP64_ONLY(true ||) (VM_Version::is_P6() && reg->has_byte_register())) {
- movsbl(reg, reg); // movsxb
- } else {
- shll(reg, 24);
- sarl(reg, 24);
- }
+ movsbl(reg, reg); // movsxb
}
void MacroAssembler::sign_extend_short(Register reg) {
- if (LP64_ONLY(true ||) VM_Version::is_P6()) {
- movswl(reg, reg); // movsxw
- } else {
- shll(reg, 16);
- sarl(reg, 16);
- }
+ movswl(reg, reg); // movsxw
}
void MacroAssembler::testl(Address dst, int32_t imm32) {
@@ -3305,8 +2540,6 @@ void MacroAssembler::testl(Register dst, AddressLiteral src) {
testl(dst, as_Address(src));
}
-#ifdef _LP64
-
void MacroAssembler::testq(Address dst, int32_t imm32) {
if (imm32 >= 0) {
testl(dst, imm32);
@@ -3323,8 +2556,6 @@ void MacroAssembler::testq(Register dst, int32_t imm32) {
}
}
-#endif
-
void MacroAssembler::pcmpeqb(XMMRegister dst, XMMRegister src) {
assert(((dst->encoding() < 16 && src->encoding() < 16) || VM_Version::supports_avx512vlbw()),"XMM register should be 0-15");
Assembler::pcmpeqb(dst, src);
@@ -4111,8 +3342,8 @@ void MacroAssembler::clear_jobject_tag(Register possibly_non_local) {
}
void MacroAssembler::resolve_jobject(Register value,
- Register thread,
Register tmp) {
+ Register thread = r15_thread;
assert_different_registers(value, thread, tmp);
Label done, tagged, weak_tagged;
testptr(value, value);
@@ -4121,7 +3352,7 @@ void MacroAssembler::resolve_jobject(Register value,
jcc(Assembler::notZero, tagged);
// Resolve local handle
- access_load_at(T_OBJECT, IN_NATIVE | AS_RAW, value, Address(value, 0), tmp, thread);
+ access_load_at(T_OBJECT, IN_NATIVE | AS_RAW, value, Address(value, 0), tmp);
verify_oop(value);
jmp(done);
@@ -4130,22 +3361,22 @@ void MacroAssembler::resolve_jobject(Register value,
jcc(Assembler::notZero, weak_tagged);
// Resolve global handle
- access_load_at(T_OBJECT, IN_NATIVE, value, Address(value, -JNIHandles::TypeTag::global), tmp, thread);
+ access_load_at(T_OBJECT, IN_NATIVE, value, Address(value, -JNIHandles::TypeTag::global), tmp);
verify_oop(value);
jmp(done);
bind(weak_tagged);
// Resolve jweak.
access_load_at(T_OBJECT, IN_NATIVE | ON_PHANTOM_OOP_REF,
- value, Address(value, -JNIHandles::TypeTag::weak_global), tmp, thread);
+ value, Address(value, -JNIHandles::TypeTag::weak_global), tmp);
verify_oop(value);
bind(done);
}
void MacroAssembler::resolve_global_jobject(Register value,
- Register thread,
Register tmp) {
+ Register thread = r15_thread;
assert_different_registers(value, thread, tmp);
Label done;
@@ -4163,23 +3394,23 @@ void MacroAssembler::resolve_global_jobject(Register value,
#endif
// Resolve global handle
- access_load_at(T_OBJECT, IN_NATIVE, value, Address(value, -JNIHandles::TypeTag::global), tmp, thread);
+ access_load_at(T_OBJECT, IN_NATIVE, value, Address(value, -JNIHandles::TypeTag::global), tmp);
verify_oop(value);
bind(done);
}
void MacroAssembler::subptr(Register dst, int32_t imm32) {
- LP64_ONLY(subq(dst, imm32)) NOT_LP64(subl(dst, imm32));
+ subq(dst, imm32);
}
// Force generation of a 4 byte immediate value even if it fits into 8bit
void MacroAssembler::subptr_imm32(Register dst, int32_t imm32) {
- LP64_ONLY(subq_imm32(dst, imm32)) NOT_LP64(subl_imm32(dst, imm32));
+ subq_imm32(dst, imm32);
}
void MacroAssembler::subptr(Register dst, Register src) {
- LP64_ONLY(subq(dst, src)) NOT_LP64(subl(dst, src));
+ subq(dst, src);
}
// C++ bool manipulation
@@ -4197,36 +3428,30 @@ void MacroAssembler::testbool(Register dst) {
}
void MacroAssembler::testptr(Register dst, Register src) {
- LP64_ONLY(testq(dst, src)) NOT_LP64(testl(dst, src));
+ testq(dst, src);
}
// Defines obj, preserves var_size_in_bytes, okay for t2 == var_size_in_bytes.
-void MacroAssembler::tlab_allocate(Register thread, Register obj,
+void MacroAssembler::tlab_allocate(Register obj,
Register var_size_in_bytes,
int con_size_in_bytes,
Register t1,
Register t2,
Label& slow_case) {
BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler();
- bs->tlab_allocate(this, thread, obj, var_size_in_bytes, con_size_in_bytes, t1, t2, slow_case);
+ bs->tlab_allocate(this, obj, var_size_in_bytes, con_size_in_bytes, t1, t2, slow_case);
}
RegSet MacroAssembler::call_clobbered_gp_registers() {
RegSet regs;
-#ifdef _LP64
regs += RegSet::of(rax, rcx, rdx);
#ifndef _WINDOWS
regs += RegSet::of(rsi, rdi);
#endif
regs += RegSet::range(r8, r11);
-#else
- regs += RegSet::of(rax, rcx, rdx);
-#endif
-#ifdef _LP64
if (UseAPX) {
regs += RegSet::range(r16, as_Register(Register::number_of_registers - 1));
}
-#endif
return regs;
}
@@ -4243,46 +3468,25 @@ XMMRegSet MacroAssembler::call_clobbered_xmm_registers() {
#endif
}
-static int FPUSaveAreaSize = align_up(108, StackAlignmentInBytes); // 108 bytes needed for FPU state by fsave/frstor
-
-#ifndef _LP64
-static bool use_x87_registers() { return UseSSE < 2; }
-#endif
-static bool use_xmm_registers() { return UseSSE >= 1; }
-
// C1 only ever uses the first double/float of the XMM register.
-static int xmm_save_size() { return UseSSE >= 2 ? sizeof(double) : sizeof(float); }
+static int xmm_save_size() { return sizeof(double); }
static void save_xmm_register(MacroAssembler* masm, int offset, XMMRegister reg) {
- if (UseSSE == 1) {
- masm->movflt(Address(rsp, offset), reg);
- } else {
- masm->movdbl(Address(rsp, offset), reg);
- }
+ masm->movdbl(Address(rsp, offset), reg);
}
static void restore_xmm_register(MacroAssembler* masm, int offset, XMMRegister reg) {
- if (UseSSE == 1) {
- masm->movflt(reg, Address(rsp, offset));
- } else {
- masm->movdbl(reg, Address(rsp, offset));
- }
+ masm->movdbl(reg, Address(rsp, offset));
}
static int register_section_sizes(RegSet gp_registers, XMMRegSet xmm_registers,
- bool save_fpu, int& gp_area_size,
- int& fp_area_size, int& xmm_area_size) {
+ bool save_fpu, int& gp_area_size, int& xmm_area_size) {
gp_area_size = align_up(gp_registers.size() * Register::max_slots_per_register * VMRegImpl::stack_slot_size,
StackAlignmentInBytes);
-#ifdef _LP64
- fp_area_size = 0;
-#else
- fp_area_size = (save_fpu && use_x87_registers()) ? FPUSaveAreaSize : 0;
-#endif
- xmm_area_size = (save_fpu && use_xmm_registers()) ? xmm_registers.size() * xmm_save_size() : 0;
+ xmm_area_size = save_fpu ? xmm_registers.size() * xmm_save_size() : 0;
- return gp_area_size + fp_area_size + xmm_area_size;
+ return gp_area_size + xmm_area_size;
}
void MacroAssembler::push_call_clobbered_registers_except(RegSet exclude, bool save_fpu) {
@@ -4291,22 +3495,15 @@ void MacroAssembler::push_call_clobbered_registers_except(RegSet exclude, bool s
RegSet gp_registers_to_push = call_clobbered_gp_registers() - exclude;
int gp_area_size;
- int fp_area_size;
int xmm_area_size;
int total_save_size = register_section_sizes(gp_registers_to_push, call_clobbered_xmm_registers(), save_fpu,
- gp_area_size, fp_area_size, xmm_area_size);
+ gp_area_size, xmm_area_size);
subptr(rsp, total_save_size);
push_set(gp_registers_to_push, 0);
-#ifndef _LP64
- if (save_fpu && use_x87_registers()) {
- fnsave(Address(rsp, gp_area_size));
- fwait();
- }
-#endif
- if (save_fpu && use_xmm_registers()) {
- push_set(call_clobbered_xmm_registers(), gp_area_size + fp_area_size);
+ if (save_fpu) {
+ push_set(call_clobbered_xmm_registers(), gp_area_size);
}
block_comment("push_call_clobbered_registers end");
@@ -4318,19 +3515,13 @@ void MacroAssembler::pop_call_clobbered_registers_except(RegSet exclude, bool re
RegSet gp_registers_to_pop = call_clobbered_gp_registers() - exclude;
int gp_area_size;
- int fp_area_size;
int xmm_area_size;
int total_save_size = register_section_sizes(gp_registers_to_pop, call_clobbered_xmm_registers(), restore_fpu,
- gp_area_size, fp_area_size, xmm_area_size);
+ gp_area_size, xmm_area_size);
- if (restore_fpu && use_xmm_registers()) {
- pop_set(call_clobbered_xmm_registers(), gp_area_size + fp_area_size);
+ if (restore_fpu) {
+ pop_set(call_clobbered_xmm_registers(), gp_area_size);
}
-#ifndef _LP64
- if (restore_fpu && use_x87_registers()) {
- frstor(Address(rsp, gp_area_size));
- }
-#endif
pop_set(gp_registers_to_pop, 0);
@@ -4430,27 +3621,12 @@ void MacroAssembler::zero_memory(Register address, Register length_in_bytes, int
shrptr(index, 2); // use 2 instructions to avoid partial flag stall
shrptr(index, 1);
}
-#ifndef _LP64
- // index could have not been a multiple of 8 (i.e., bit 2 was set)
- {
- Label even;
- // note: if index was a multiple of 8, then it cannot
- // be 0 now otherwise it must have been 0 before
- // => if it is even, we don't need to check for 0 again
- jcc(Assembler::carryClear, even);
- // clear topmost word (no jump would be needed if conditional assignment worked here)
- movptr(Address(address, index, Address::times_8, offset_in_bytes - 0*BytesPerWord), temp);
- // index could be 0 now, must check again
- jcc(Assembler::zero, done);
- bind(even);
- }
-#endif // !_LP64
+
// initialize remaining object fields: index is a multiple of 2 now
{
Label loop;
bind(loop);
movptr(Address(address, index, Address::times_8, offset_in_bytes - 1*BytesPerWord), temp);
- NOT_LP64(movptr(Address(address, index, Address::times_8, offset_in_bytes - 2*BytesPerWord), temp);)
decrement(index);
jcc(Assembler::notZero, loop);
}
@@ -4827,9 +4003,8 @@ void MacroAssembler::check_klass_subtype_slow_path_linear(Register sub_klass,
#ifndef PRODUCT
uint* pst_counter = &SharedRuntime::_partial_subtype_ctr;
ExternalAddress pst_counter_addr((address) pst_counter);
- NOT_LP64( incrementl(pst_counter_addr) );
- LP64_ONLY( lea(rcx, pst_counter_addr) );
- LP64_ONLY( incrementl(Address(rcx, 0)) );
+ lea(rcx, pst_counter_addr);
+ incrementl(Address(rcx, 0));
#endif //PRODUCT
// We will consult the secondary-super array.
@@ -4875,22 +4050,6 @@ void MacroAssembler::check_klass_subtype_slow_path_linear(Register sub_klass,
bind(L_fallthrough);
}
-#ifndef _LP64
-
-// 32-bit x86 only: always use the linear search.
-void MacroAssembler::check_klass_subtype_slow_path(Register sub_klass,
- Register super_klass,
- Register temp_reg,
- Register temp2_reg,
- Label* L_success,
- Label* L_failure,
- bool set_cond_codes) {
- check_klass_subtype_slow_path_linear
- (sub_klass, super_klass, temp_reg, temp2_reg, L_success, L_failure, set_cond_codes);
-}
-
-#else // _LP64
-
void MacroAssembler::check_klass_subtype_slow_path(Register sub_klass,
Register super_klass,
Register temp_reg,
@@ -5474,9 +4633,7 @@ void MacroAssembler::verify_secondary_supers_table(Register r_sub_klass,
#undef LOOKUP_SECONDARY_SUPERS_TABLE_REGISTERS
-#endif // LP64
-
-void MacroAssembler::clinit_barrier(Register klass, Register thread, Label* L_fast_path, Label* L_slow_path) {
+void MacroAssembler::clinit_barrier(Register klass, Label* L_fast_path, Label* L_slow_path) {
assert(L_fast_path != nullptr || L_slow_path != nullptr, "at least one is required");
Label L_fallthrough;
@@ -5492,7 +4649,7 @@ void MacroAssembler::clinit_barrier(Register klass, Register thread, Label* L_fa
jcc(Assembler::equal, *L_fast_path);
// Fast path check: current thread is initializer thread
- cmpptr(thread, Address(klass, InstanceKlass::init_thread_offset()));
+ cmpptr(r15_thread, Address(klass, InstanceKlass::init_thread_offset()));
if (L_slow_path == &L_fallthrough) {
jcc(Assembler::equal, *L_fast_path);
bind(*L_slow_path);
@@ -5530,9 +4687,7 @@ void MacroAssembler::_verify_oop(Register reg, const char* s, const char* file,
if (!VerifyOops) return;
BLOCK_COMMENT("verify_oop {");
-#ifdef _LP64
push(rscratch1);
-#endif
push(rax); // save rax
push(reg); // pass register argument
@@ -5590,9 +4745,7 @@ Address MacroAssembler::argument_address(RegisterOrConstant arg_slot,
void MacroAssembler::_verify_oop_addr(Address addr, const char* s, const char* file, int line) {
if (!VerifyOops) return;
-#ifdef _LP64
push(rscratch1);
-#endif
push(rax); // save rax,
// addr may contain rsp so we will have to adjust it based on the push
// we just did (and on 64 bit we do two pushes)
@@ -5600,7 +4753,7 @@ void MacroAssembler::_verify_oop_addr(Address addr, const char* s, const char* f
// stores rax into addr which is backwards of what was intended.
if (addr.uses(rsp)) {
lea(rax, addr);
- pushptr(Address(rax, LP64_ONLY(2 *) BytesPerWord));
+ pushptr(Address(rax, 2 * BytesPerWord));
} else {
pushptr(addr);
}
@@ -5627,27 +4780,23 @@ void MacroAssembler::verify_tlab() {
if (UseTLAB && VerifyOops) {
Label next, ok;
Register t1 = rsi;
- Register thread_reg = NOT_LP64(rbx) LP64_ONLY(r15_thread);
push(t1);
- NOT_LP64(push(thread_reg));
- NOT_LP64(get_thread(thread_reg));
- movptr(t1, Address(thread_reg, in_bytes(JavaThread::tlab_top_offset())));
- cmpptr(t1, Address(thread_reg, in_bytes(JavaThread::tlab_start_offset())));
+ movptr(t1, Address(r15_thread, in_bytes(JavaThread::tlab_top_offset())));
+ cmpptr(t1, Address(r15_thread, in_bytes(JavaThread::tlab_start_offset())));
jcc(Assembler::aboveEqual, next);
STOP("assert(top >= start)");
should_not_reach_here();
bind(next);
- movptr(t1, Address(thread_reg, in_bytes(JavaThread::tlab_end_offset())));
- cmpptr(t1, Address(thread_reg, in_bytes(JavaThread::tlab_top_offset())));
+ movptr(t1, Address(r15_thread, in_bytes(JavaThread::tlab_end_offset())));
+ cmpptr(t1, Address(r15_thread, in_bytes(JavaThread::tlab_top_offset())));
jcc(Assembler::aboveEqual, ok);
STOP("assert(top <= end)");
should_not_reach_here();
bind(ok);
- NOT_LP64(pop(thread_reg));
pop(t1);
}
#endif
@@ -5927,85 +5076,6 @@ void MacroAssembler::print_CPU_state() {
pop_CPU_state();
}
-
-#ifndef _LP64
-static bool _verify_FPU(int stack_depth, char* s, CPU_State* state) {
- static int counter = 0;
- FPU_State* fs = &state->_fpu_state;
- counter++;
- // For leaf calls, only verify that the top few elements remain empty.
- // We only need 1 empty at the top for C2 code.
- if( stack_depth < 0 ) {
- if( fs->tag_for_st(7) != 3 ) {
- printf("FPR7 not empty\n");
- state->print();
- assert(false, "error");
- return false;
- }
- return true; // All other stack states do not matter
- }
-
- assert((fs->_control_word._value & 0xffff) == StubRoutines::x86::fpu_cntrl_wrd_std(),
- "bad FPU control word");
-
- // compute stack depth
- int i = 0;
- while (i < FPU_State::number_of_registers && fs->tag_for_st(i) < 3) i++;
- int d = i;
- while (i < FPU_State::number_of_registers && fs->tag_for_st(i) == 3) i++;
- // verify findings
- if (i != FPU_State::number_of_registers) {
- // stack not contiguous
- printf("%s: stack not contiguous at ST%d\n", s, i);
- state->print();
- assert(false, "error");
- return false;
- }
- // check if computed stack depth corresponds to expected stack depth
- if (stack_depth < 0) {
- // expected stack depth is -stack_depth or less
- if (d > -stack_depth) {
- // too many elements on the stack
- printf("%s: <= %d stack elements expected but found %d\n", s, -stack_depth, d);
- state->print();
- assert(false, "error");
- return false;
- }
- } else {
- // expected stack depth is stack_depth
- if (d != stack_depth) {
- // wrong stack depth
- printf("%s: %d stack elements expected but found %d\n", s, stack_depth, d);
- state->print();
- assert(false, "error");
- return false;
- }
- }
- // everything is cool
- return true;
-}
-
-void MacroAssembler::verify_FPU(int stack_depth, const char* s) {
- if (!VerifyFPU) return;
- push_CPU_state();
- push(rsp); // pass CPU state
- ExternalAddress msg((address) s);
- // pass message string s
- pushptr(msg.addr(), noreg);
- push(stack_depth); // pass stack depth
- call(RuntimeAddress(CAST_FROM_FN_PTR(address, _verify_FPU)));
- addptr(rsp, 3 * wordSize); // discard arguments
- // check for error
- { Label L;
- testl(rax, rax);
- jcc(Assembler::notZero, L);
- int3(); // break if error condition
- bind(L);
- }
- pop_CPU_state();
-}
-#endif // _LP64
-
void MacroAssembler::restore_cpu_control_state_after_jni(Register rscratch) {
// Either restore the MXCSR register after returning from the JNI Call
// or verify that it wasn't changed (with -Xcheck:jni flag).
@@ -6018,14 +5088,6 @@ void MacroAssembler::restore_cpu_control_state_after_jni(Register rscratch) {
}
// Clear upper bits of YMM registers to avoid SSE <-> AVX transition penalty.
vzeroupper();
-
-#ifndef _LP64
- // Either restore the x87 floating pointer control word after returning
- // from the JNI call or verify that it wasn't changed.
- if (CheckJNICalls) {
- call(RuntimeAddress(StubRoutines::x86::verify_fpu_cntrl_wrd_entry()));
- }
-#endif // _LP64
}
// ((OopHandle)result).resolve();
@@ -6036,7 +5098,7 @@ void MacroAssembler::resolve_oop_handle(Register result, Register tmp) {
// Only IN_HEAP loads require a thread_tmp register
// OopHandle::resolve is an indirection like jobject.
access_load_at(T_OBJECT, IN_NATIVE,
- result, Address(result, 0), tmp, /*tmp_thread*/noreg);
+ result, Address(result, 0), tmp);
}
// ((WeakHandle)result).resolve();
@@ -6052,7 +5114,7 @@ void MacroAssembler::resolve_weak_handle(Register rresult, Register rtmp) {
// Only IN_HEAP loads require a thread_tmp register
// WeakHandle::resolve is an indirection like jweak.
access_load_at(T_OBJECT, IN_NATIVE | ON_PHANTOM_OOP_REF,
- rresult, Address(rresult, 0), rtmp, /*tmp_thread*/noreg);
+ rresult, Address(rresult, 0), rtmp);
bind(resolved);
}
@@ -6075,27 +5137,23 @@ void MacroAssembler::load_method_holder(Register holder, Register method) {
movptr(holder, Address(holder, ConstantPool::pool_holder_offset())); // InstanceKlass*
}
-#ifdef _LP64
void MacroAssembler::load_narrow_klass_compact(Register dst, Register src) {
assert(UseCompactObjectHeaders, "expect compact object headers");
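+  // With compact object headers, the narrow Klass* lives in the upper bits of the mark word.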
movq(dst, Address(src, oopDesc::mark_offset_in_bytes()));
shrq(dst, markWord::klass_shift);
}
-#endif
void MacroAssembler::load_klass(Register dst, Register src, Register tmp) {
assert_different_registers(src, tmp);
assert_different_registers(dst, tmp);
-#ifdef _LP64
+
if (UseCompactObjectHeaders) {
load_narrow_klass_compact(dst, src);
decode_klass_not_null(dst, tmp);
} else if (UseCompressedClassPointers) {
movl(dst, Address(src, oopDesc::klass_offset_in_bytes()));
decode_klass_not_null(dst, tmp);
- } else
-#endif
- {
+ } else {
movptr(dst, Address(src, oopDesc::klass_offset_in_bytes()));
}
}
@@ -6104,17 +5162,15 @@ void MacroAssembler::store_klass(Register dst, Register src, Register tmp) {
assert(!UseCompactObjectHeaders, "not with compact headers");
assert_different_registers(src, tmp);
assert_different_registers(dst, tmp);
-#ifdef _LP64
if (UseCompressedClassPointers) {
encode_klass_not_null(src, tmp);
movl(Address(dst, oopDesc::klass_offset_in_bytes()), src);
- } else
-#endif
+ } else {
movptr(Address(dst, oopDesc::klass_offset_in_bytes()), src);
+ }
}
void MacroAssembler::cmp_klass(Register klass, Register obj, Register tmp) {
-#ifdef _LP64
if (UseCompactObjectHeaders) {
assert(tmp != noreg, "need tmp");
assert_different_registers(klass, obj, tmp);
@@ -6122,15 +5178,12 @@ void MacroAssembler::cmp_klass(Register klass, Register obj, Register tmp) {
cmpl(klass, tmp);
} else if (UseCompressedClassPointers) {
cmpl(klass, Address(obj, oopDesc::klass_offset_in_bytes()));
- } else
-#endif
- {
+ } else {
cmpptr(klass, Address(obj, oopDesc::klass_offset_in_bytes()));
}
}
void MacroAssembler::cmp_klasses_from_objects(Register obj1, Register obj2, Register tmp1, Register tmp2) {
-#ifdef _LP64
if (UseCompactObjectHeaders) {
assert(tmp2 != noreg, "need tmp2");
assert_different_registers(obj1, obj2, tmp1, tmp2);
@@ -6140,23 +5193,21 @@ void MacroAssembler::cmp_klasses_from_objects(Register obj1, Register obj2, Regi
} else if (UseCompressedClassPointers) {
movl(tmp1, Address(obj1, oopDesc::klass_offset_in_bytes()));
cmpl(tmp1, Address(obj2, oopDesc::klass_offset_in_bytes()));
- } else
-#endif
- {
+ } else {
movptr(tmp1, Address(obj1, oopDesc::klass_offset_in_bytes()));
cmpptr(tmp1, Address(obj2, oopDesc::klass_offset_in_bytes()));
}
}
void MacroAssembler::access_load_at(BasicType type, DecoratorSet decorators, Register dst, Address src,
- Register tmp1, Register thread_tmp) {
+ Register tmp1) {
BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler();
decorators = AccessInternal::decorator_fixup(decorators, type);
bool as_raw = (decorators & AS_RAW) != 0;
if (as_raw) {
- bs->BarrierSetAssembler::load_at(this, decorators, type, dst, src, tmp1, thread_tmp);
+ bs->BarrierSetAssembler::load_at(this, decorators, type, dst, src, tmp1);
} else {
- bs->load_at(this, decorators, type, dst, src, tmp1, thread_tmp);
+ bs->load_at(this, decorators, type, dst, src, tmp1);
}
}
@@ -6172,15 +5223,13 @@ void MacroAssembler::access_store_at(BasicType type, DecoratorSet decorators, Ad
}
}
-void MacroAssembler::load_heap_oop(Register dst, Address src, Register tmp1,
- Register thread_tmp, DecoratorSet decorators) {
- access_load_at(T_OBJECT, IN_HEAP | decorators, dst, src, tmp1, thread_tmp);
+void MacroAssembler::load_heap_oop(Register dst, Address src, Register tmp1, DecoratorSet decorators) {
+ access_load_at(T_OBJECT, IN_HEAP | decorators, dst, src, tmp1);
}
// Doesn't do verification, generates fixed size code
-void MacroAssembler::load_heap_oop_not_null(Register dst, Address src, Register tmp1,
- Register thread_tmp, DecoratorSet decorators) {
- access_load_at(T_OBJECT, IN_HEAP | IS_NOT_NULL | decorators, dst, src, tmp1, thread_tmp);
+void MacroAssembler::load_heap_oop_not_null(Register dst, Address src, Register tmp1, DecoratorSet decorators) {
+ access_load_at(T_OBJECT, IN_HEAP | IS_NOT_NULL | decorators, dst, src, tmp1);
}
void MacroAssembler::store_heap_oop(Address dst, Register val, Register tmp1,
@@ -6193,7 +5242,6 @@ void MacroAssembler::store_heap_oop_null(Address dst) {
access_store_at(T_OBJECT, IN_HEAP, dst, noreg, noreg, noreg, noreg);
}
-#ifdef _LP64
void MacroAssembler::store_klass_gap(Register dst, Register src) {
assert(!UseCompactObjectHeaders, "Don't use with compact headers");
if (UseCompressedClassPointers) {
@@ -6515,8 +5563,6 @@ void MacroAssembler::reinit_heapbase() {
}
}
-#endif // _LP64
-
#if COMPILER2_OR_JVMCI
// clear memory of size 'cnt' qwords, starting at 'base' using XMM/YMM/ZMM registers
@@ -6695,8 +5741,6 @@ void MacroAssembler::clear_mem(Register base, Register cnt, Register tmp, XMMReg
cmpptr(cnt, InitArrayShortSize/BytesPerLong);
jccb(Assembler::greater, LONG);
- NOT_LP64(shlptr(cnt, 1);) // convert to number of 32-bit words for 32-bit VM
-
decrement(cnt);
jccb(Assembler::negative, DONE); // Zero length
@@ -6717,7 +5761,6 @@ void MacroAssembler::clear_mem(Register base, Register cnt, Register tmp, XMMReg
} else if (UseXMMForObjInit) {
xmm_clear_mem(base, cnt, tmp, xtmp, mask);
} else {
- NOT_LP64(shlptr(cnt, 1);) // convert to number of 32-bit words for 32-bit VM
rep_stos();
}
@@ -6735,7 +5778,7 @@ void MacroAssembler::generate_fill(BasicType t, bool aligned,
Label L_exit;
Label L_fill_2_bytes, L_fill_4_bytes;
-#if defined(COMPILER2) && defined(_LP64)
+#if defined(COMPILER2)
if(MaxVectorSize >=32 &&
VM_Version::supports_avx512vlbw() &&
VM_Version::supports_bmi2()) {
@@ -6796,39 +5839,7 @@ void MacroAssembler::generate_fill(BasicType t, bool aligned,
subptr(count, 1<<(shift-1));
BIND(L_skip_align2);
}
- if (UseSSE < 2) {
- Label L_fill_32_bytes_loop, L_check_fill_8_bytes, L_fill_8_bytes_loop, L_fill_8_bytes;
- // Fill 32-byte chunks
- subptr(count, 8 << shift);
- jcc(Assembler::less, L_check_fill_8_bytes);
- align(16);
-
- BIND(L_fill_32_bytes_loop);
-
- for (int i = 0; i < 32; i += 4) {
- movl(Address(to, i), value);
- }
-
- addptr(to, 32);
- subptr(count, 8 << shift);
- jcc(Assembler::greaterEqual, L_fill_32_bytes_loop);
- BIND(L_check_fill_8_bytes);
- addptr(count, 8 << shift);
- jccb(Assembler::zero, L_exit);
- jmpb(L_fill_8_bytes);
-
- //
- // length is too short, just fill qwords
- //
- BIND(L_fill_8_bytes_loop);
- movl(Address(to, 0), value);
- movl(Address(to, 4), value);
- addptr(to, 8);
- BIND(L_fill_8_bytes);
- subptr(count, 1 << (shift + 1));
- jcc(Assembler::greaterEqual, L_fill_8_bytes_loop);
- // fall through to fill 4 bytes
- } else {
+ {
Label L_fill_32_bytes;
if (!UseUnalignedLoadStores) {
// align to 8 bytes, we know we are 4 byte aligned to start
@@ -6840,7 +5851,6 @@ void MacroAssembler::generate_fill(BasicType t, bool aligned,
}
BIND(L_fill_32_bytes);
{
- assert( UseSSE >= 2, "supported cpu only" );
Label L_fill_32_bytes_loop, L_check_fill_8_bytes, L_fill_8_bytes_loop, L_fill_8_bytes;
movdl(xtmp, value);
if (UseAVX >= 2 && UseUnalignedLoadStores) {
@@ -7148,7 +6158,6 @@ void MacroAssembler::encode_iso_array(Register src, Register dst, Register len,
bind(L_done);
}
-#ifdef _LP64
/**
* Helper for multiply_to_len().
*/
@@ -8323,7 +7332,6 @@ void MacroAssembler::mul_add(Register out, Register in, Register offs,
pop(tmp2);
pop(tmp1);
}
-#endif
/**
* Emits code to update CRC-32 with a byte value according to constants in table
@@ -8546,7 +7554,6 @@ void MacroAssembler::kernel_crc32(Register crc, Register buf, Register len, Regi
notl(crc); // ~c
}
-#ifdef _LP64
// Helper function for AVX 512 CRC32
// Fold 512-bit data chunks
void MacroAssembler::fold512bit_crc32_avx512(XMMRegister xcrc, XMMRegister xK, XMMRegister xtmp, Register buf,
@@ -9064,155 +8071,7 @@ void MacroAssembler::crc32c_proc_chunk(uint32_t size, uint32_t const_or_pre_comp
bind(L_exit);
}
-#else
-void MacroAssembler::crc32c_ipl_alg4(Register in_out, uint32_t n,
- Register tmp1, Register tmp2, Register tmp3,
- XMMRegister xtmp1, XMMRegister xtmp2) {
- lea(tmp3, ExternalAddress(StubRoutines::crc32c_table_addr()));
- if (n > 0) {
- addl(tmp3, n * 256 * 8);
- }
- // Q1 = TABLEExt[n][B & 0xFF];
- movl(tmp1, in_out);
- andl(tmp1, 0x000000FF);
- shll(tmp1, 3);
- addl(tmp1, tmp3);
- movq(xtmp1, Address(tmp1, 0));
-
- // Q2 = TABLEExt[n][B >> 8 & 0xFF];
- movl(tmp2, in_out);
- shrl(tmp2, 8);
- andl(tmp2, 0x000000FF);
- shll(tmp2, 3);
- addl(tmp2, tmp3);
- movq(xtmp2, Address(tmp2, 0));
-
- psllq(xtmp2, 8);
- pxor(xtmp1, xtmp2);
-
- // Q3 = TABLEExt[n][B >> 16 & 0xFF];
- movl(tmp2, in_out);
- shrl(tmp2, 16);
- andl(tmp2, 0x000000FF);
- shll(tmp2, 3);
- addl(tmp2, tmp3);
- movq(xtmp2, Address(tmp2, 0));
-
- psllq(xtmp2, 16);
- pxor(xtmp1, xtmp2);
-
- // Q4 = TABLEExt[n][B >> 24 & 0xFF];
- shrl(in_out, 24);
- andl(in_out, 0x000000FF);
- shll(in_out, 3);
- addl(in_out, tmp3);
- movq(xtmp2, Address(in_out, 0));
-
- psllq(xtmp2, 24);
- pxor(xtmp1, xtmp2); // Result in CXMM
- // return Q1 ^ Q2 << 8 ^ Q3 << 16 ^ Q4 << 24;
-}
-
-void MacroAssembler::crc32c_pclmulqdq(XMMRegister w_xtmp1,
- Register in_out,
- uint32_t const_or_pre_comp_const_index, bool is_pclmulqdq_supported,
- XMMRegister w_xtmp2,
- Register tmp1,
- Register n_tmp2, Register n_tmp3) {
- if (is_pclmulqdq_supported) {
- movdl(w_xtmp1, in_out);
-
- movl(tmp1, const_or_pre_comp_const_index);
- movdl(w_xtmp2, tmp1);
- pclmulqdq(w_xtmp1, w_xtmp2, 0);
- // Keep result in XMM since GPR is 32 bit in length
- } else {
- crc32c_ipl_alg4(in_out, const_or_pre_comp_const_index, tmp1, n_tmp2, n_tmp3, w_xtmp1, w_xtmp2);
- }
-}
-
-void MacroAssembler::crc32c_rec_alt2(uint32_t const_or_pre_comp_const_index_u1, uint32_t const_or_pre_comp_const_index_u2, bool is_pclmulqdq_supported, Register in_out, Register in1, Register in2,
- XMMRegister w_xtmp1, XMMRegister w_xtmp2, XMMRegister w_xtmp3,
- Register tmp1, Register tmp2,
- Register n_tmp3) {
- crc32c_pclmulqdq(w_xtmp1, in_out, const_or_pre_comp_const_index_u1, is_pclmulqdq_supported, w_xtmp3, tmp1, tmp2, n_tmp3);
- crc32c_pclmulqdq(w_xtmp2, in1, const_or_pre_comp_const_index_u2, is_pclmulqdq_supported, w_xtmp3, tmp1, tmp2, n_tmp3);
-
- psllq(w_xtmp1, 1);
- movdl(tmp1, w_xtmp1);
- psrlq(w_xtmp1, 32);
- movdl(in_out, w_xtmp1);
-
- xorl(tmp2, tmp2);
- crc32(tmp2, tmp1, 4);
- xorl(in_out, tmp2);
-
- psllq(w_xtmp2, 1);
- movdl(tmp1, w_xtmp2);
- psrlq(w_xtmp2, 32);
- movdl(in1, w_xtmp2);
-
- xorl(tmp2, tmp2);
- crc32(tmp2, tmp1, 4);
- xorl(in1, tmp2);
- xorl(in_out, in1);
- xorl(in_out, in2);
-}
-
-void MacroAssembler::crc32c_proc_chunk(uint32_t size, uint32_t const_or_pre_comp_const_index_u1, uint32_t const_or_pre_comp_const_index_u2, bool is_pclmulqdq_supported,
- Register in_out1, Register in_out2, Register in_out3,
- Register tmp1, Register tmp2, Register tmp3,
- XMMRegister w_xtmp1, XMMRegister w_xtmp2, XMMRegister w_xtmp3,
- Register tmp4, Register tmp5,
- Register n_tmp6) {
- Label L_processPartitions;
- Label L_processPartition;
- Label L_exit;
-
- bind(L_processPartitions);
- cmpl(in_out1, 3 * size);
- jcc(Assembler::less, L_exit);
- xorl(tmp1, tmp1);
- xorl(tmp2, tmp2);
- movl(tmp3, in_out2);
- addl(tmp3, size);
-
- bind(L_processPartition);
- crc32(in_out3, Address(in_out2, 0), 4);
- crc32(tmp1, Address(in_out2, size), 4);
- crc32(tmp2, Address(in_out2, size*2), 4);
- crc32(in_out3, Address(in_out2, 0+4), 4);
- crc32(tmp1, Address(in_out2, size+4), 4);
- crc32(tmp2, Address(in_out2, size*2+4), 4);
- addl(in_out2, 8);
- cmpl(in_out2, tmp3);
- jcc(Assembler::less, L_processPartition);
-
- push(tmp3);
- push(in_out1);
- push(in_out2);
- tmp4 = tmp3;
- tmp5 = in_out1;
- n_tmp6 = in_out2;
-
- crc32c_rec_alt2(const_or_pre_comp_const_index_u1, const_or_pre_comp_const_index_u2, is_pclmulqdq_supported, in_out3, tmp1, tmp2,
- w_xtmp1, w_xtmp2, w_xtmp3,
- tmp4, tmp5,
- n_tmp6);
-
- pop(in_out2);
- pop(in_out1);
- pop(tmp3);
-
- addl(in_out2, 2 * size);
- subl(in_out1, 3 * size);
- jmp(L_processPartitions);
-
- bind(L_exit);
-}
-#endif //LP64
-#ifdef _LP64
// Algorithm 2: Pipelined usage of the CRC32 instruction.
// Input: A buffer I of L bytes.
// Output: the CRC32C value of the buffer.
@@ -9304,84 +8163,6 @@ void MacroAssembler::crc32c_ipl_alg2_alt2(Register in_out, Register in1, Registe
BIND(L_exit);
}
-#else
-void MacroAssembler::crc32c_ipl_alg2_alt2(Register in_out, Register in1, Register in2,
- Register tmp1, Register tmp2, Register tmp3,
- Register tmp4, Register tmp5, Register tmp6,
- XMMRegister w_xtmp1, XMMRegister w_xtmp2, XMMRegister w_xtmp3,
- bool is_pclmulqdq_supported) {
- uint32_t const_or_pre_comp_const_index[CRC32C_NUM_PRECOMPUTED_CONSTANTS];
- Label L_wordByWord;
- Label L_byteByByteProlog;
- Label L_byteByByte;
- Label L_exit;
-
- if (is_pclmulqdq_supported) {
- const_or_pre_comp_const_index[1] = *(uint32_t *)StubRoutines::crc32c_table_addr();
- const_or_pre_comp_const_index[0] = *((uint32_t *)StubRoutines::crc32c_table_addr() + 1);
-
- const_or_pre_comp_const_index[3] = *((uint32_t *)StubRoutines::crc32c_table_addr() + 2);
- const_or_pre_comp_const_index[2] = *((uint32_t *)StubRoutines::crc32c_table_addr() + 3);
-
- const_or_pre_comp_const_index[5] = *((uint32_t *)StubRoutines::crc32c_table_addr() + 4);
- const_or_pre_comp_const_index[4] = *((uint32_t *)StubRoutines::crc32c_table_addr() + 5);
- } else {
- const_or_pre_comp_const_index[0] = 1;
- const_or_pre_comp_const_index[1] = 0;
-
- const_or_pre_comp_const_index[2] = 3;
- const_or_pre_comp_const_index[3] = 2;
-
- const_or_pre_comp_const_index[4] = 5;
- const_or_pre_comp_const_index[5] = 4;
- }
- crc32c_proc_chunk(CRC32C_HIGH, const_or_pre_comp_const_index[0], const_or_pre_comp_const_index[1], is_pclmulqdq_supported,
- in2, in1, in_out,
- tmp1, tmp2, tmp3,
- w_xtmp1, w_xtmp2, w_xtmp3,
- tmp4, tmp5,
- tmp6);
- crc32c_proc_chunk(CRC32C_MIDDLE, const_or_pre_comp_const_index[2], const_or_pre_comp_const_index[3], is_pclmulqdq_supported,
- in2, in1, in_out,
- tmp1, tmp2, tmp3,
- w_xtmp1, w_xtmp2, w_xtmp3,
- tmp4, tmp5,
- tmp6);
- crc32c_proc_chunk(CRC32C_LOW, const_or_pre_comp_const_index[4], const_or_pre_comp_const_index[5], is_pclmulqdq_supported,
- in2, in1, in_out,
- tmp1, tmp2, tmp3,
- w_xtmp1, w_xtmp2, w_xtmp3,
- tmp4, tmp5,
- tmp6);
- movl(tmp1, in2);
- andl(tmp1, 0x00000007);
- negl(tmp1);
- addl(tmp1, in2);
- addl(tmp1, in1);
-
- BIND(L_wordByWord);
- cmpl(in1, tmp1);
- jcc(Assembler::greaterEqual, L_byteByByteProlog);
- crc32(in_out, Address(in1,0), 4);
- addl(in1, 4);
- jmp(L_wordByWord);
-
- BIND(L_byteByByteProlog);
- andl(in2, 0x00000007);
- movl(tmp2, 1);
-
- BIND(L_byteByByte);
- cmpl(tmp2, in2);
- jccb(Assembler::greater, L_exit);
- movb(tmp1, Address(in1, 0));
- crc32(in_out, tmp1, 1);
- incl(in1);
- incl(tmp2);
- jmp(L_byteByByte);
-
- BIND(L_exit);
-}
-#endif // LP64
#undef BIND
#undef BLOCK_COMMENT
@@ -10375,7 +9156,6 @@ void MacroAssembler::fill64(Register dst, int disp, XMMRegister xmm, bool use64b
fill64(Address(dst, disp), xmm, use64byteVector);
}
-#ifdef _LP64
void MacroAssembler::generate_fill_avx3(BasicType type, Register to, Register value,
Register count, Register rtmp, XMMRegister xtmp) {
Label L_exit;
@@ -10552,11 +9332,9 @@ void MacroAssembler::generate_fill_avx3(BasicType type, Register to, Register va
}
bind(L_exit);
}
-#endif
#endif //COMPILER2_OR_JVMCI
-#ifdef _LP64
void MacroAssembler::convert_f2i(Register dst, XMMRegister src) {
Label done;
cvttss2sil(dst, src);
@@ -10720,8 +9498,6 @@ void MacroAssembler::cache_wbsync(bool is_pre)
}
}
-#endif // _LP64
-
Assembler::Condition MacroAssembler::negate_condition(Assembler::Condition cond) {
switch (cond) {
// Note some conditions are synonyms for others
@@ -10746,33 +9522,29 @@ Assembler::Condition MacroAssembler::negate_condition(Assembler::Condition cond)
}
// This is simply a call to Thread::current()
-void MacroAssembler::get_thread(Register thread) {
+void MacroAssembler::get_thread_slow(Register thread) {
if (thread != rax) {
push(rax);
}
- LP64_ONLY(push(rdi);)
- LP64_ONLY(push(rsi);)
+ push(rdi);
+ push(rsi);
push(rdx);
push(rcx);
-#ifdef _LP64
push(r8);
push(r9);
push(r10);
push(r11);
-#endif
MacroAssembler::call_VM_leaf_base(CAST_FROM_FN_PTR(address, Thread::current), 0);
-#ifdef _LP64
pop(r11);
pop(r10);
pop(r9);
pop(r8);
-#endif
pop(rcx);
pop(rdx);
- LP64_ONLY(pop(rsi);)
- LP64_ONLY(pop(rdi);)
+ pop(rsi);
+ pop(rdi);
if (thread != rax) {
mov(thread, rax);
pop(rax);
@@ -10801,7 +9573,9 @@ void MacroAssembler::check_stack_alignment(Register sp, const char* msg, unsigne
// reg_rax: rax
// thread: the thread which attempts to lock obj
// tmp: a temporary register
-void MacroAssembler::lightweight_lock(Register basic_lock, Register obj, Register reg_rax, Register thread, Register tmp, Label& slow) {
+void MacroAssembler::lightweight_lock(Register basic_lock, Register obj, Register reg_rax, Register tmp, Label& slow) {
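+  // The current JavaThread is always available in r15 on x86_64.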
+ Register thread = r15_thread;
+
assert(reg_rax == rax, "");
assert_different_registers(basic_lock, obj, reg_rax, thread, tmp);
@@ -10813,10 +9587,16 @@ void MacroAssembler::lightweight_lock(Register basic_lock, Register obj, Registe
movptr(reg_rax, Address(obj, oopDesc::mark_offset_in_bytes()));
if (UseObjectMonitorTable) {
- // Clear cache in case fast locking succeeds.
+ // Clear cache in case fast locking succeeds or we need to take the slow-path.
movptr(Address(basic_lock, BasicObjectLock::lock_offset() + in_ByteSize((BasicLock::object_monitor_cache_offset_in_bytes()))), 0);
}
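+  // When DiagnoseSyncOnValueBasedClasses is set, divert locking on value-based classes to the slow path.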
+ if (DiagnoseSyncOnValueBasedClasses != 0) {
+ load_klass(tmp, obj, rscratch1);
+ testb(Address(tmp, Klass::misc_flags_offset()), KlassFlags::_misc_is_value_based_class);
+ jcc(Assembler::notZero, slow);
+ }
+
// Load top.
movl(top, Address(thread, JavaThread::lock_stack_top_offset()));
@@ -10855,7 +9635,9 @@ void MacroAssembler::lightweight_lock(Register basic_lock, Register obj, Registe
// reg_rax: rax
// thread: the thread
// tmp: a temporary register
-void MacroAssembler::lightweight_unlock(Register obj, Register reg_rax, Register thread, Register tmp, Label& slow) {
+void MacroAssembler::lightweight_unlock(Register obj, Register reg_rax, Register tmp, Label& slow) {
+ Register thread = r15_thread;
+
assert(reg_rax == rax, "");
assert_different_registers(obj, reg_rax, thread, tmp);
@@ -10907,7 +9689,6 @@ void MacroAssembler::lightweight_unlock(Register obj, Register reg_rax, Register
bind(unlocked);
}
-#ifdef _LP64
// Saves legacy GPRs state on stack.
void MacroAssembler::save_legacy_gprs() {
subq(rsp, 16 * wordSize);
@@ -10956,4 +9737,3 @@ void MacroAssembler::setcc(Assembler::Condition comparison, Register dst) {
movzbl(dst, dst);
}
}
-#endif
diff --git a/src/hotspot/cpu/x86/macroAssembler_x86.hpp b/src/hotspot/cpu/x86/macroAssembler_x86.hpp
index 36ef3a69d49d4..efd1a4c154f1c 100644
--- a/src/hotspot/cpu/x86/macroAssembler_x86.hpp
+++ b/src/hotspot/cpu/x86/macroAssembler_x86.hpp
@@ -59,13 +59,10 @@ class MacroAssembler: public Assembler {
// may customize this version by overriding it for its purposes (e.g., to save/restore
// additional registers when doing a VM call).
//
- // If no java_thread register is specified (noreg) than rdi will be used instead. call_VM_base
- // returns the register which contains the thread upon return. If a thread register has been
- // specified, the return value will correspond to that register. If no last_java_sp is specified
- // (noreg) than rsp will be used instead.
+ // call_VM_base returns the register which contains the thread upon return.
+  // If no last_java_sp is specified (noreg) then rsp will be used instead.
virtual void call_VM_base( // returns the register containing the thread upon return
Register oop_result, // where an oop-result ends up if any; use noreg otherwise
- Register java_thread, // the thread if computed before ; use noreg otherwise
Register last_java_sp, // to set up last_Java_frame in stubs; use noreg otherwise
address entry_point, // the entry point
int number_of_arguments, // the number of arguments (w/o thread) to pop after the call
@@ -74,19 +71,14 @@ class MacroAssembler: public Assembler {
void call_VM_helper(Register oop_result, address entry_point, int number_of_arguments, bool check_exceptions = true);
- // helpers for FPU flag access
- // tmp is a temporary register, if none is available use noreg
- void save_rax (Register tmp);
- void restore_rax(Register tmp);
-
public:
MacroAssembler(CodeBuffer* code) : Assembler(code) {}
// These routines should emit JVMTI PopFrame and ForceEarlyReturn handling code.
// The implementation is only non-empty for the InterpreterMacroAssembler,
// as only the interpreter handles PopFrame and ForceEarlyReturn requests.
- virtual void check_and_handle_popframe(Register java_thread);
- virtual void check_and_handle_earlyret(Register java_thread);
+ virtual void check_and_handle_popframe();
+ virtual void check_and_handle_earlyret();
Address as_Address(AddressLiteral adr);
Address as_Address(ArrayAddress adr, Register rscratch);
@@ -148,10 +140,10 @@ class MacroAssembler: public Assembler {
// Support for inc/dec with optimal instruction selection depending on value
- void increment(Register reg, int value = 1) { LP64_ONLY(incrementq(reg, value)) NOT_LP64(incrementl(reg, value)) ; }
- void decrement(Register reg, int value = 1) { LP64_ONLY(decrementq(reg, value)) NOT_LP64(decrementl(reg, value)) ; }
- void increment(Address dst, int value = 1) { LP64_ONLY(incrementq(dst, value)) NOT_LP64(incrementl(dst, value)) ; }
- void decrement(Address dst, int value = 1) { LP64_ONLY(decrementq(dst, value)) NOT_LP64(decrementl(dst, value)) ; }
+ void increment(Register reg, int value = 1) { incrementq(reg, value); }
+ void decrement(Register reg, int value = 1) { decrementq(reg, value); }
+ void increment(Address dst, int value = 1) { incrementq(dst, value); }
+ void decrement(Address dst, int value = 1) { decrementq(dst, value); }
void decrementl(Address dst, int value = 1);
void decrementl(Register reg, int value = 1);
@@ -224,11 +216,11 @@ class MacroAssembler: public Assembler {
void enter();
void leave();
- // Support for getting the JavaThread pointer (i.e.; a reference to thread-local information)
- // The pointer will be loaded into the thread register.
- void get_thread(Register thread);
+  // Support for getting the JavaThread pointer (i.e., a reference to thread-local information).
+  // The pointer will be loaded into the thread register. This is a slow version that makes a native call.
+  // Normally, the JavaThread pointer is available in r15_thread; use that where possible.
+ void get_thread_slow(Register thread);
-#ifdef _LP64
// Support for argument shuffling
// bias in bytes
@@ -244,7 +236,6 @@ class MacroAssembler: public Assembler {
VMRegPair dst,
bool is_receiver,
int* receiver_offset);
-#endif // _LP64
// Support for VM calls
//
@@ -291,8 +282,8 @@ class MacroAssembler: public Assembler {
Register arg_1, Register arg_2, Register arg_3,
bool check_exceptions = true);
- void get_vm_result (Register oop_result, Register thread);
- void get_vm_result_2(Register metadata_result, Register thread);
+ void get_vm_result_oop(Register oop_result);
+ void get_vm_result_metadata(Register metadata_result);
// These always tightly bind to MacroAssembler::call_VM_base
// bypassing the virtual implementation
@@ -323,35 +314,22 @@ class MacroAssembler: public Assembler {
void super_call_VM_leaf(address entry_point, Register arg_1, Register arg_2, Register arg_3);
void super_call_VM_leaf(address entry_point, Register arg_1, Register arg_2, Register arg_3, Register arg_4);
- // last Java Frame (fills frame anchor)
- void set_last_Java_frame(Register thread,
- Register last_java_sp,
- Register last_java_fp,
- address last_java_pc,
- Register rscratch);
-
- // thread in the default location (r15_thread on 64bit)
void set_last_Java_frame(Register last_java_sp,
Register last_java_fp,
address last_java_pc,
Register rscratch);
-#ifdef _LP64
void set_last_Java_frame(Register last_java_sp,
Register last_java_fp,
Label &last_java_pc,
Register scratch);
-#endif
-
- void reset_last_Java_frame(Register thread, bool clear_fp);
- // thread in the default location (r15_thread on 64bit)
void reset_last_Java_frame(bool clear_fp);
// jobjects
void clear_jobject_tag(Register possibly_non_local);
- void resolve_jobject(Register value, Register thread, Register tmp);
- void resolve_global_jobject(Register value, Register thread, Register tmp);
+ void resolve_jobject(Register value, Register tmp);
+ void resolve_global_jobject(Register value, Register tmp);
// C 'boolean' to Java boolean: x == 0 ? 0 : 1
void c2bool(Register x);
@@ -371,9 +349,7 @@ class MacroAssembler: public Assembler {
void load_method_holder(Register holder, Register method);
// oop manipulations
-#ifdef _LP64
void load_narrow_klass_compact(Register dst, Register src);
-#endif
void load_klass(Register dst, Register src, Register tmp);
void store_klass(Register dst, Register src, Register tmp);
@@ -386,14 +362,12 @@ class MacroAssembler: public Assembler {
void cmp_klasses_from_objects(Register obj1, Register obj2, Register tmp1, Register tmp2);
void access_load_at(BasicType type, DecoratorSet decorators, Register dst, Address src,
- Register tmp1, Register thread_tmp);
+ Register tmp1);
void access_store_at(BasicType type, DecoratorSet decorators, Address dst, Register val,
Register tmp1, Register tmp2, Register tmp3);
- void load_heap_oop(Register dst, Address src, Register tmp1 = noreg,
- Register thread_tmp = noreg, DecoratorSet decorators = 0);
- void load_heap_oop_not_null(Register dst, Address src, Register tmp1 = noreg,
- Register thread_tmp = noreg, DecoratorSet decorators = 0);
+ void load_heap_oop(Register dst, Address src, Register tmp1 = noreg, DecoratorSet decorators = 0);
+ void load_heap_oop_not_null(Register dst, Address src, Register tmp1 = noreg, DecoratorSet decorators = 0);
void store_heap_oop(Address dst, Register val, Register tmp1 = noreg,
Register tmp2 = noreg, Register tmp3 = noreg, DecoratorSet decorators = 0);
@@ -401,7 +375,6 @@ class MacroAssembler: public Assembler {
// stored using routines that take a jobject.
void store_heap_oop_null(Address dst);
-#ifdef _LP64
void store_klass_gap(Register dst, Register src);
// This dummy is to prevent a call to store_heap_oop from
@@ -436,8 +409,6 @@ class MacroAssembler: public Assembler {
DEBUG_ONLY(void verify_heapbase(const char* msg);)
-#endif // _LP64
-
// Int division/remainder for Java
// (as idivl, but checks for special case as described in JVM spec.)
// returns idivl instruction offset for implicit exception handling
@@ -477,39 +448,6 @@ class MacroAssembler: public Assembler {
// Division by power of 2, rounding towards 0
void division_with_shift(Register reg, int shift_value);
-#ifndef _LP64
- // Compares the top-most stack entries on the FPU stack and sets the eflags as follows:
- //
- // CF (corresponds to C0) if x < y
- // PF (corresponds to C2) if unordered
- // ZF (corresponds to C3) if x = y
- //
- // The arguments are in reversed order on the stack (i.e., top of stack is first argument).
- // tmp is a temporary register, if none is available use noreg (only matters for non-P6 code)
- void fcmp(Register tmp);
- // Variant of the above which allows y to be further down the stack
- // and which only pops x and y if specified. If pop_right is
- // specified then pop_left must also be specified.
- void fcmp(Register tmp, int index, bool pop_left, bool pop_right);
-
- // Floating-point comparison for Java
- // Compares the top-most stack entries on the FPU stack and stores the result in dst.
- // The arguments are in reversed order on the stack (i.e., top of stack is first argument).
- // (semantics as described in JVM spec.)
- void fcmp2int(Register dst, bool unordered_is_less);
- // Variant of the above which allows y to be further down the stack
- // and which only pops x and y if specified. If pop_right is
- // specified then pop_left must also be specified.
- void fcmp2int(Register dst, bool unordered_is_less, int index, bool pop_left, bool pop_right);
-
- // Floating-point remainder for Java (ST0 = ST0 fremr ST1, ST1 is empty afterwards)
- // tmp is a temporary register, if none is available use noreg
- void fremr(Register tmp);
-
- // only if +VerifyFPU
- void verify_FPU(int stack_depth, const char* s = "illegal FPU state");
-#endif // !LP64
-
// dst = c = a * b + c
void fmad(XMMRegister dst, XMMRegister a, XMMRegister b, XMMRegister c);
void fmaf(XMMRegister dst, XMMRegister a, XMMRegister b, XMMRegister c);
@@ -524,34 +462,6 @@ class MacroAssembler: public Assembler {
void cmpss2int(XMMRegister opr1, XMMRegister opr2, Register dst, bool unordered_is_less);
void cmpsd2int(XMMRegister opr1, XMMRegister opr2, Register dst, bool unordered_is_less);
- // branch to L if FPU flag C2 is set/not set
- // tmp is a temporary register, if none is available use noreg
- void jC2 (Register tmp, Label& L);
- void jnC2(Register tmp, Label& L);
-
- // Load float value from 'address'. If UseSSE >= 1, the value is loaded into
- // register xmm0. Otherwise, the value is loaded onto the FPU stack.
- void load_float(Address src);
-
- // Store float value to 'address'. If UseSSE >= 1, the value is stored
- // from register xmm0. Otherwise, the value is stored from the FPU stack.
- void store_float(Address dst);
-
- // Load double value from 'address'. If UseSSE >= 2, the value is loaded into
- // register xmm0. Otherwise, the value is loaded onto the FPU stack.
- void load_double(Address src);
-
- // Store double value to 'address'. If UseSSE >= 2, the value is stored
- // from register xmm0. Otherwise, the value is stored from the FPU stack.
- void store_double(Address dst);
-
-#ifndef _LP64
- // Pop ST (ffree & fincstp combined)
- void fpop();
-
- void empty_FPU_stack();
-#endif // !_LP64
-
void push_IU_state();
void pop_IU_state();
@@ -603,7 +513,6 @@ class MacroAssembler: public Assembler {
// allocation
void tlab_allocate(
- Register thread, // Current thread
Register obj, // result: pointer to object after successful allocation
Register var_size_in_bytes, // object size in bytes if unknown at compile time; invalid otherwise
int con_size_in_bytes, // object size in bytes if known at compile time
@@ -666,7 +575,6 @@ class MacroAssembler: public Assembler {
Label* L_failure,
bool set_cond_codes = false);
-#ifdef _LP64
// The 64-bit version, which may do a hashed subclass lookup.
void check_klass_subtype_slow_path(Register sub_klass,
Register super_klass,
@@ -676,7 +584,6 @@ class MacroAssembler: public Assembler {
Register temp4_reg,
Label* L_success,
Label* L_failure);
-#endif
// Three parts of a hashed subclass lookup: a simple linear search,
// a table lookup, and a fallback that does linear probing in the
@@ -713,7 +620,6 @@ class MacroAssembler: public Assembler {
Register result,
u1 super_klass_slot);
-#ifdef _LP64
using Assembler::salq;
void salq(Register dest, Register count);
using Assembler::rorq;
@@ -741,7 +647,6 @@ class MacroAssembler: public Assembler {
Register temp1,
Register temp2,
Register temp3);
-#endif
void repne_scanq(Register addr, Register value, Register count, Register limit,
Label* L_success,
@@ -762,7 +667,6 @@ class MacroAssembler: public Assembler {
Label& L_success);
void clinit_barrier(Register klass,
- Register thread,
Label* L_fast_path = nullptr,
Label* L_slow_path = nullptr);
@@ -837,7 +741,7 @@ class MacroAssembler: public Assembler {
// Check for reserved stack access in method being exited (for JIT)
void reserved_stack_check();
- void safepoint_poll(Label& slow_path, Register thread_reg, bool at_return, bool in_nmethod);
+ void safepoint_poll(Label& slow_path, bool at_return, bool in_nmethod);
void verify_tlab();
@@ -851,10 +755,10 @@ class MacroAssembler: public Assembler {
// Arithmetics
- void addptr(Address dst, int32_t src) { LP64_ONLY(addq(dst, src)) NOT_LP64(addl(dst, src)) ; }
+ void addptr(Address dst, int32_t src) { addq(dst, src); }
void addptr(Address dst, Register src);
- void addptr(Register dst, Address src) { LP64_ONLY(addq(dst, src)) NOT_LP64(addl(dst, src)); }
+ void addptr(Register dst, Address src) { addq(dst, src); }
void addptr(Register dst, int32_t src);
void addptr(Register dst, Register src);
void addptr(Register dst, RegisterOrConstant src) {
@@ -863,12 +767,10 @@ class MacroAssembler: public Assembler {
}
void andptr(Register dst, int32_t src);
- void andptr(Register src1, Register src2) { LP64_ONLY(andq(src1, src2)) NOT_LP64(andl(src1, src2)) ; }
+ void andptr(Register src1, Register src2) { andq(src1, src2); }
-#ifdef _LP64
using Assembler::andq;
void andq(Register dst, AddressLiteral src, Register rscratch = noreg);
-#endif
void cmp8(AddressLiteral src1, int imm, Register rscratch = noreg);
@@ -881,12 +783,6 @@ class MacroAssembler: public Assembler {
void cmp32(Register src1, Address src2);
-#ifndef _LP64
- void cmpklass(Address dst, Metadata* obj);
- void cmpklass(Register dst, Metadata* obj);
- void cmpoop(Address dst, jobject obj);
-#endif // _LP64
-
void cmpoop(Register src1, Register src2);
void cmpoop(Register src1, Address src2);
void cmpoop(Register dst, jobject obj, Register rscratch);
@@ -896,12 +792,11 @@ class MacroAssembler: public Assembler {
void cmpptr(Register src1, AddressLiteral src2, Register rscratch = noreg);
- void cmpptr(Register src1, Register src2) { LP64_ONLY(cmpq(src1, src2)) NOT_LP64(cmpl(src1, src2)) ; }
- void cmpptr(Register src1, Address src2) { LP64_ONLY(cmpq(src1, src2)) NOT_LP64(cmpl(src1, src2)) ; }
- // void cmpptr(Address src1, Register src2) { LP64_ONLY(cmpq(src1, src2)) NOT_LP64(cmpl(src1, src2)) ; }
+ void cmpptr(Register src1, Register src2) { cmpq(src1, src2); }
+ void cmpptr(Register src1, Address src2) { cmpq(src1, src2); }
- void cmpptr(Register src1, int32_t src2) { LP64_ONLY(cmpq(src1, src2)) NOT_LP64(cmpl(src1, src2)) ; }
- void cmpptr(Address src1, int32_t src2) { LP64_ONLY(cmpq(src1, src2)) NOT_LP64(cmpl(src1, src2)) ; }
+ void cmpptr(Register src1, int32_t src2) { cmpq(src1, src2); }
+ void cmpptr(Address src1, int32_t src2) { cmpq(src1, src2); }
   // cmp64 to avoid hiding cmpq
void cmp64(Register src1, AddressLiteral src, Register rscratch = noreg);
@@ -910,26 +805,26 @@ class MacroAssembler: public Assembler {
void locked_cmpxchgptr(Register reg, AddressLiteral adr, Register rscratch = noreg);
- void imulptr(Register dst, Register src) { LP64_ONLY(imulq(dst, src)) NOT_LP64(imull(dst, src)); }
- void imulptr(Register dst, Register src, int imm32) { LP64_ONLY(imulq(dst, src, imm32)) NOT_LP64(imull(dst, src, imm32)); }
+ void imulptr(Register dst, Register src) { imulq(dst, src); }
+ void imulptr(Register dst, Register src, int imm32) { imulq(dst, src, imm32); }
- void negptr(Register dst) { LP64_ONLY(negq(dst)) NOT_LP64(negl(dst)); }
+ void negptr(Register dst) { negq(dst); }
- void notptr(Register dst) { LP64_ONLY(notq(dst)) NOT_LP64(notl(dst)); }
+ void notptr(Register dst) { notq(dst); }
void shlptr(Register dst, int32_t shift);
- void shlptr(Register dst) { LP64_ONLY(shlq(dst)) NOT_LP64(shll(dst)); }
+ void shlptr(Register dst) { shlq(dst); }
void shrptr(Register dst, int32_t shift);
- void shrptr(Register dst) { LP64_ONLY(shrq(dst)) NOT_LP64(shrl(dst)); }
+ void shrptr(Register dst) { shrq(dst); }
- void sarptr(Register dst) { LP64_ONLY(sarq(dst)) NOT_LP64(sarl(dst)); }
- void sarptr(Register dst, int32_t src) { LP64_ONLY(sarq(dst, src)) NOT_LP64(sarl(dst, src)); }
+ void sarptr(Register dst) { sarq(dst); }
+ void sarptr(Register dst, int32_t src) { sarq(dst, src); }
- void subptr(Address dst, int32_t src) { LP64_ONLY(subq(dst, src)) NOT_LP64(subl(dst, src)); }
+ void subptr(Address dst, int32_t src) { subq(dst, src); }
- void subptr(Register dst, Address src) { LP64_ONLY(subq(dst, src)) NOT_LP64(subl(dst, src)); }
+ void subptr(Register dst, Address src) { subq(dst, src); }
void subptr(Register dst, int32_t src);
// Force generation of a 4 byte immediate value even if it fits into 8bit
void subptr_imm32(Register dst, int32_t src);
@@ -939,13 +834,13 @@ class MacroAssembler: public Assembler {
else subptr(dst, src.as_register());
}
- void sbbptr(Address dst, int32_t src) { LP64_ONLY(sbbq(dst, src)) NOT_LP64(sbbl(dst, src)); }
- void sbbptr(Register dst, int32_t src) { LP64_ONLY(sbbq(dst, src)) NOT_LP64(sbbl(dst, src)); }
+ void sbbptr(Address dst, int32_t src) { sbbq(dst, src); }
+ void sbbptr(Register dst, int32_t src) { sbbq(dst, src); }
- void xchgptr(Register src1, Register src2) { LP64_ONLY(xchgq(src1, src2)) NOT_LP64(xchgl(src1, src2)) ; }
- void xchgptr(Register src1, Address src2) { LP64_ONLY(xchgq(src1, src2)) NOT_LP64(xchgl(src1, src2)) ; }
+ void xchgptr(Register src1, Register src2) { xchgq(src1, src2); }
+ void xchgptr(Register src1, Address src2) { xchgq(src1, src2); }
- void xaddptr(Address src1, Register src2) { LP64_ONLY(xaddq(src1, src2)) NOT_LP64(xaddl(src1, src2)) ; }
+ void xaddptr(Address src1, Register src2) { xaddq(src1, src2); }
@@ -955,12 +850,10 @@ class MacroAssembler: public Assembler {
// Unconditional atomic increment.
void atomic_incl(Address counter_addr);
void atomic_incl(AddressLiteral counter_addr, Register rscratch = noreg);
-#ifdef _LP64
void atomic_incq(Address counter_addr);
void atomic_incq(AddressLiteral counter_addr, Register rscratch = noreg);
-#endif
- void atomic_incptr(AddressLiteral counter_addr, Register rscratch = noreg) { LP64_ONLY(atomic_incq(counter_addr, rscratch)) NOT_LP64(atomic_incl(counter_addr, rscratch)) ; }
- void atomic_incptr(Address counter_addr) { LP64_ONLY(atomic_incq(counter_addr)) NOT_LP64(atomic_incl(counter_addr)) ; }
+ void atomic_incptr(AddressLiteral counter_addr, Register rscratch = noreg) { atomic_incq(counter_addr, rscratch); }
+ void atomic_incptr(Address counter_addr) { atomic_incq(counter_addr); }
using Assembler::lea;
void lea(Register dst, AddressLiteral adr);
@@ -978,18 +871,18 @@ class MacroAssembler: public Assembler {
void testq(Address dst, int32_t imm32);
void testq(Register dst, int32_t imm32);
- void orptr(Register dst, Address src) { LP64_ONLY(orq(dst, src)) NOT_LP64(orl(dst, src)); }
- void orptr(Register dst, Register src) { LP64_ONLY(orq(dst, src)) NOT_LP64(orl(dst, src)); }
- void orptr(Register dst, int32_t src) { LP64_ONLY(orq(dst, src)) NOT_LP64(orl(dst, src)); }
- void orptr(Address dst, int32_t imm32) { LP64_ONLY(orq(dst, imm32)) NOT_LP64(orl(dst, imm32)); }
+ void orptr(Register dst, Address src) { orq(dst, src); }
+ void orptr(Register dst, Register src) { orq(dst, src); }
+ void orptr(Register dst, int32_t src) { orq(dst, src); }
+ void orptr(Address dst, int32_t imm32) { orq(dst, imm32); }
- void testptr(Register src, int32_t imm32) { LP64_ONLY(testq(src, imm32)) NOT_LP64(testl(src, imm32)); }
- void testptr(Register src1, Address src2) { LP64_ONLY(testq(src1, src2)) NOT_LP64(testl(src1, src2)); }
- void testptr(Address src, int32_t imm32) { LP64_ONLY(testq(src, imm32)) NOT_LP64(testl(src, imm32)); }
+ void testptr(Register src, int32_t imm32) { testq(src, imm32); }
+ void testptr(Register src1, Address src2) { testq(src1, src2); }
+ void testptr(Address src, int32_t imm32) { testq(src, imm32); }
void testptr(Register src1, Register src2);
- void xorptr(Register dst, Register src) { LP64_ONLY(xorq(dst, src)) NOT_LP64(xorl(dst, src)); }
- void xorptr(Register dst, Address src) { LP64_ONLY(xorq(dst, src)) NOT_LP64(xorl(dst, src)); }
+ void xorptr(Register dst, Register src) { xorq(dst, src); }
+ void xorptr(Register dst, Address src) { xorq(dst, src); }
// Calls
@@ -1114,32 +1007,10 @@ class MacroAssembler: public Assembler {
void comisd(XMMRegister dst, Address src) { Assembler::comisd(dst, src); }
void comisd(XMMRegister dst, AddressLiteral src, Register rscratch = noreg);
-#ifndef _LP64
- void fadd_s(Address src) { Assembler::fadd_s(src); }
- void fadd_s(AddressLiteral src) { Assembler::fadd_s(as_Address(src)); }
-
- void fldcw(Address src) { Assembler::fldcw(src); }
- void fldcw(AddressLiteral src);
-
- void fld_s(int index) { Assembler::fld_s(index); }
- void fld_s(Address src) { Assembler::fld_s(src); }
- void fld_s(AddressLiteral src);
-
- void fld_d(Address src) { Assembler::fld_d(src); }
- void fld_d(AddressLiteral src);
-
- void fld_x(Address src) { Assembler::fld_x(src); }
- void fld_x(AddressLiteral src) { Assembler::fld_x(as_Address(src)); }
-
- void fmul_s(Address src) { Assembler::fmul_s(src); }
- void fmul_s(AddressLiteral src) { Assembler::fmul_s(as_Address(src)); }
-#endif // !_LP64
-
void cmp32_mxcsr_std(Address mxcsr_save, Register tmp, Register rscratch = noreg);
void ldmxcsr(Address src) { Assembler::ldmxcsr(src); }
void ldmxcsr(AddressLiteral src, Register rscratch = noreg);
-#ifdef _LP64
private:
void sha256_AVX2_one_round_compute(
Register reg_old_h,
@@ -1189,7 +1060,6 @@ class MacroAssembler: public Assembler {
Register buf, Register state, Register ofs, Register limit, Register rsp, bool multi_block,
XMMRegister shuf_mask);
void sha512_update_ni_x1(Register arg_hash, Register arg_msg, Register ofs, Register limit, bool multi_block);
-#endif // _LP64
void fast_md5(Register buf, Address state, Address ofs, Address limit,
bool multi_block);
@@ -1199,68 +1069,15 @@ class MacroAssembler: public Assembler {
Register buf, Register state, Register ofs, Register limit, Register rsp,
bool multi_block);
-#ifdef _LP64
void fast_sha256(XMMRegister msg, XMMRegister state0, XMMRegister state1, XMMRegister msgtmp0,
XMMRegister msgtmp1, XMMRegister msgtmp2, XMMRegister msgtmp3, XMMRegister msgtmp4,
Register buf, Register state, Register ofs, Register limit, Register rsp,
bool multi_block, XMMRegister shuf_mask);
-#else
- void fast_sha256(XMMRegister msg, XMMRegister state0, XMMRegister state1, XMMRegister msgtmp0,
- XMMRegister msgtmp1, XMMRegister msgtmp2, XMMRegister msgtmp3, XMMRegister msgtmp4,
- Register buf, Register state, Register ofs, Register limit, Register rsp,
- bool multi_block);
-#endif
void fast_exp(XMMRegister xmm0, XMMRegister xmm1, XMMRegister xmm2, XMMRegister xmm3,
XMMRegister xmm4, XMMRegister xmm5, XMMRegister xmm6, XMMRegister xmm7,
Register rax, Register rcx, Register rdx, Register tmp);
-#ifndef _LP64
- private:
- // Initialized in macroAssembler_x86_constants.cpp
- static address ONES;
- static address L_2IL0FLOATPACKET_0;
- static address PI4_INV;
- static address PI4X3;
- static address PI4X4;
-
- public:
- void fast_log(XMMRegister xmm0, XMMRegister xmm1, XMMRegister xmm2, XMMRegister xmm3,
- XMMRegister xmm4, XMMRegister xmm5, XMMRegister xmm6, XMMRegister xmm7,
- Register rax, Register rcx, Register rdx, Register tmp1);
-
- void fast_log10(XMMRegister xmm0, XMMRegister xmm1, XMMRegister xmm2, XMMRegister xmm3,
- XMMRegister xmm4, XMMRegister xmm5, XMMRegister xmm6, XMMRegister xmm7,
- Register rax, Register rcx, Register rdx, Register tmp);
-
- void fast_pow(XMMRegister xmm0, XMMRegister xmm1, XMMRegister xmm2, XMMRegister xmm3, XMMRegister xmm4,
- XMMRegister xmm5, XMMRegister xmm6, XMMRegister xmm7, Register rax, Register rcx,
- Register rdx, Register tmp);
-
- void fast_sin(XMMRegister xmm0, XMMRegister xmm1, XMMRegister xmm2, XMMRegister xmm3,
- XMMRegister xmm4, XMMRegister xmm5, XMMRegister xmm6, XMMRegister xmm7,
- Register rax, Register rbx, Register rdx);
-
- void fast_cos(XMMRegister xmm0, XMMRegister xmm1, XMMRegister xmm2, XMMRegister xmm3,
- XMMRegister xmm4, XMMRegister xmm5, XMMRegister xmm6, XMMRegister xmm7,
- Register rax, Register rcx, Register rdx, Register tmp);
-
- void libm_sincos_huge(XMMRegister xmm0, XMMRegister xmm1, Register eax, Register ecx,
- Register edx, Register ebx, Register esi, Register edi,
- Register ebp, Register esp);
-
- void libm_reduce_pi04l(Register eax, Register ecx, Register edx, Register ebx,
- Register esi, Register edi, Register ebp, Register esp);
-
- void libm_tancot_huge(XMMRegister xmm0, XMMRegister xmm1, Register eax, Register ecx,
- Register edx, Register ebx, Register esi, Register edi,
- Register ebp, Register esp);
-
- void fast_tan(XMMRegister xmm0, XMMRegister xmm1, XMMRegister xmm2, XMMRegister xmm3,
- XMMRegister xmm4, XMMRegister xmm5, XMMRegister xmm6, XMMRegister xmm7,
- Register rax, Register rcx, Register rdx, Register tmp);
-#endif // !_LP64
-
private:
// these are private because users should be doing movflt/movdbl
@@ -2027,8 +1844,8 @@ class MacroAssembler: public Assembler {
void cmov( Condition cc, Register dst, Register src) { cmovptr(cc, dst, src); }
- void cmovptr(Condition cc, Register dst, Address src) { LP64_ONLY(cmovq(cc, dst, src)) NOT_LP64(cmov32(cc, dst, src)); }
- void cmovptr(Condition cc, Register dst, Register src) { LP64_ONLY(cmovq(cc, dst, src)) NOT_LP64(cmov32(cc, dst, src)); }
+ void cmovptr(Condition cc, Register dst, Address src) { cmovq(cc, dst, src); }
+ void cmovptr(Condition cc, Register dst, Register src) { cmovq(cc, dst, src); }
void movoop(Register dst, jobject obj);
void movoop(Address dst, jobject obj, Register rscratch);
@@ -2067,15 +1884,15 @@ class MacroAssembler: public Assembler {
// Can push value or effective address
void pushptr(AddressLiteral src, Register rscratch);
- void pushptr(Address src) { LP64_ONLY(pushq(src)) NOT_LP64(pushl(src)); }
- void popptr(Address src) { LP64_ONLY(popq(src)) NOT_LP64(popl(src)); }
+ void pushptr(Address src) { pushq(src); }
+ void popptr(Address src) { popq(src); }
void pushoop(jobject obj, Register rscratch);
void pushklass(Metadata* obj, Register rscratch);
   // sign-extend an 'l' (32-bit) value to a ptr-sized element as needed
- void movl2ptr(Register dst, Address src) { LP64_ONLY(movslq(dst, src)) NOT_LP64(movl(dst, src)); }
- void movl2ptr(Register dst, Register src) { LP64_ONLY(movslq(dst, src)) NOT_LP64(if (dst != src) movl(dst, src)); }
+ void movl2ptr(Register dst, Address src) { movslq(dst, src); }
+ void movl2ptr(Register dst, Register src) { movslq(dst, src); }
public:
@@ -2098,7 +1915,6 @@ class MacroAssembler: public Assembler {
XMMRegister tmp1, XMMRegister tmp2, XMMRegister tmp3,
XMMRegister tmp4, Register tmp5, Register result, bool ascii);
-#ifdef _LP64
void add2_with_carry(Register dest_hi, Register dest_lo, Register src1, Register src2);
void multiply_64_x_64_loop(Register x, Register xstart, Register x_xstart,
Register y, Register y_idx, Register z,
@@ -2139,32 +1955,22 @@ class MacroAssembler: public Assembler {
void vectorized_mismatch(Register obja, Register objb, Register length, Register log2_array_indxscale,
Register result, Register tmp1, Register tmp2,
XMMRegister vec1, XMMRegister vec2, XMMRegister vec3);
-#endif
// CRC32 code for java.util.zip.CRC32::updateBytes() intrinsic.
void update_byte_crc32(Register crc, Register val, Register table);
void kernel_crc32(Register crc, Register buf, Register len, Register table, Register tmp);
-
-#ifdef _LP64
void kernel_crc32_avx512(Register crc, Register buf, Register len, Register table, Register tmp1, Register tmp2);
void kernel_crc32_avx512_256B(Register crc, Register buf, Register len, Register key, Register pos,
Register tmp1, Register tmp2, Label& L_barrett, Label& L_16B_reduction_loop,
Label& L_get_last_two_xmms, Label& L_128_done, Label& L_cleanup);
-#endif // _LP64
// CRC32C code for java.util.zip.CRC32C::updateBytes() intrinsic
// Note on a naming convention:
// Prefix w = register only used on a Westmere+ architecture
// Prefix n = register only used on a Nehalem architecture
-#ifdef _LP64
void crc32c_ipl_alg4(Register in_out, uint32_t n,
Register tmp1, Register tmp2, Register tmp3);
-#else
- void crc32c_ipl_alg4(Register in_out, uint32_t n,
- Register tmp1, Register tmp2, Register tmp3,
- XMMRegister xtmp1, XMMRegister xtmp2);
-#endif
void crc32c_pclmulqdq(XMMRegister w_xtmp1,
Register in_out,
uint32_t const_or_pre_comp_const_index, bool is_pclmulqdq_supported,
@@ -2189,10 +1995,8 @@ class MacroAssembler: public Assembler {
// Fold 128-bit data chunk
void fold_128bit_crc32(XMMRegister xcrc, XMMRegister xK, XMMRegister xtmp, Register buf, int offset);
void fold_128bit_crc32(XMMRegister xcrc, XMMRegister xK, XMMRegister xtmp, XMMRegister xbuf);
-#ifdef _LP64
// Fold 512-bit data chunk
void fold512bit_crc32_avx512(XMMRegister xcrc, XMMRegister xK, XMMRegister xtmp, Register buf, Register pos, int offset);
-#endif // _LP64
// Fold 8-bit data
void fold_8bit_crc32(Register crc, Register table, Register tmp);
void fold_8bit_crc32(XMMRegister crc, Register table, XMMRegister xtmp, Register tmp);
@@ -2226,7 +2030,6 @@ class MacroAssembler: public Assembler {
void fill64(Register dst, int dis, XMMRegister xmm, bool use64byteVector = false);
-#ifdef _LP64
void convert_f2i(Register dst, XMMRegister src);
void convert_d2i(Register dst, XMMRegister src);
void convert_f2l(Register dst, XMMRegister src);
@@ -2241,20 +2044,17 @@ class MacroAssembler: public Assembler {
void generate_fill_avx3(BasicType type, Register to, Register value,
Register count, Register rtmp, XMMRegister xtmp);
#endif // COMPILER2_OR_JVMCI
-#endif // _LP64
void vallones(XMMRegister dst, int vector_len);
void check_stack_alignment(Register sp, const char* msg, unsigned bias = 0, Register tmp = noreg);
- void lightweight_lock(Register basic_lock, Register obj, Register reg_rax, Register thread, Register tmp, Label& slow);
- void lightweight_unlock(Register obj, Register reg_rax, Register thread, Register tmp, Label& slow);
+ void lightweight_lock(Register basic_lock, Register obj, Register reg_rax, Register tmp, Label& slow);
+ void lightweight_unlock(Register obj, Register reg_rax, Register tmp, Label& slow);
-#ifdef _LP64
void save_legacy_gprs();
void restore_legacy_gprs();
void setcc(Assembler::Condition comparison, Register dst);
-#endif
};
#endif // CPU_X86_MACROASSEMBLER_X86_HPP
diff --git a/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp b/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp
index 5fd6db868cc8b..432f927754904 100644
--- a/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp
+++ b/src/hotspot/cpu/x86/macroAssembler_x86_sha.cpp
@@ -235,17 +235,10 @@ void MacroAssembler::fast_sha1(XMMRegister abcd, XMMRegister e0, XMMRegister e1,
// and state0 and state1 can never use xmm0 register.
// ofs and limit are used for multi-block byte array.
// int com.sun.security.provider.DigestBase.implCompressMultiBlock(byte[] b, int ofs, int limit)
-#ifdef _LP64
void MacroAssembler::fast_sha256(XMMRegister msg, XMMRegister state0, XMMRegister state1, XMMRegister msgtmp0,
XMMRegister msgtmp1, XMMRegister msgtmp2, XMMRegister msgtmp3, XMMRegister msgtmp4,
Register buf, Register state, Register ofs, Register limit, Register rsp,
bool multi_block, XMMRegister shuf_mask) {
-#else
-void MacroAssembler::fast_sha256(XMMRegister msg, XMMRegister state0, XMMRegister state1, XMMRegister msgtmp0,
- XMMRegister msgtmp1, XMMRegister msgtmp2, XMMRegister msgtmp3, XMMRegister msgtmp4,
- Register buf, Register state, Register ofs, Register limit, Register rsp,
- bool multi_block) {
-#endif
Label done_hash, loop0;
address K256 = StubRoutines::x86::k256_addr();
@@ -260,9 +253,7 @@ void MacroAssembler::fast_sha256(XMMRegister msg, XMMRegister state0, XMMRegiste
palignr(state0, state1, 8);
pblendw(state1, msgtmp4, 0xF0);
-#ifdef _LP64
movdqu(shuf_mask, ExternalAddress(pshuffle_byte_flip_mask));
-#endif
lea(rax, ExternalAddress(K256));
bind(loop0);
@@ -271,11 +262,7 @@ void MacroAssembler::fast_sha256(XMMRegister msg, XMMRegister state0, XMMRegiste
// Rounds 0-3
movdqu(msg, Address(buf, 0));
-#ifdef _LP64
pshufb(msg, shuf_mask);
-#else
- pshufb(msg, ExternalAddress(pshuffle_byte_flip_mask));
-#endif
movdqa(msgtmp0, msg);
paddd(msg, Address(rax, 0));
sha256rnds2(state1, state0);
@@ -284,11 +271,7 @@ void MacroAssembler::fast_sha256(XMMRegister msg, XMMRegister state0, XMMRegiste
// Rounds 4-7
movdqu(msg, Address(buf, 16));
-#ifdef _LP64
pshufb(msg, shuf_mask);
-#else
- pshufb(msg, ExternalAddress(pshuffle_byte_flip_mask));
-#endif
movdqa(msgtmp1, msg);
paddd(msg, Address(rax, 16));
sha256rnds2(state1, state0);
@@ -298,11 +281,7 @@ void MacroAssembler::fast_sha256(XMMRegister msg, XMMRegister state0, XMMRegiste
// Rounds 8-11
movdqu(msg, Address(buf, 32));
-#ifdef _LP64
pshufb(msg, shuf_mask);
-#else
- pshufb(msg, ExternalAddress(pshuffle_byte_flip_mask));
-#endif
movdqa(msgtmp2, msg);
paddd(msg, Address(rax, 32));
sha256rnds2(state1, state0);
@@ -312,11 +291,7 @@ void MacroAssembler::fast_sha256(XMMRegister msg, XMMRegister state0, XMMRegiste
// Rounds 12-15
movdqu(msg, Address(buf, 48));
-#ifdef _LP64
pshufb(msg, shuf_mask);
-#else
- pshufb(msg, ExternalAddress(pshuffle_byte_flip_mask));
-#endif
movdqa(msgtmp3, msg);
paddd(msg, Address(rax, 48));
sha256rnds2(state1, state0);
@@ -491,10 +466,9 @@ void MacroAssembler::fast_sha256(XMMRegister msg, XMMRegister state0, XMMRegiste
}
-#ifdef _LP64
/*
The algorithm below is based on Intel publication:
- "Fast SHA-256 Implementations on Intelë Architecture Processors" by Jim Guilford, Kirk Yap and Vinodh Gopal.
+ "Fast SHA-256 Implementations on Intel(R) Architecture Processors" by Jim Guilford, Kirk Yap and Vinodh Gopal.
The assembly code was originally provided by Sean Gulley and in many places preserves
the original assembly NAMES and comments to simplify matching Java assembly with its original.
The Java version was substantially redesigned to replace 1200 assembly instruction with
@@ -1696,6 +1670,3 @@ void MacroAssembler::sha512_update_ni_x1(Register arg_hash, Register arg_msg, Re
bind(done_hash);
}
-
-#endif //#ifdef _LP64
-
diff --git a/src/hotspot/cpu/x86/matcher_x86.hpp b/src/hotspot/cpu/x86/matcher_x86.hpp
index 78591989b5b76..41486c244b247 100644
--- a/src/hotspot/cpu/x86/matcher_x86.hpp
+++ b/src/hotspot/cpu/x86/matcher_x86.hpp
@@ -59,53 +59,34 @@
static constexpr bool isSimpleConstant64(jlong value) {
// Will one (StoreL ConL) be cheaper than two (StoreI ConI)?.
//return value == (int) value; // Cf. storeImmL and immL32.
-
// Probably always true, even if a temp register is required.
-#ifdef _LP64
return true;
-#else
- return false;
-#endif
}
-#ifdef _LP64
// No additional cost for CMOVL.
static constexpr int long_cmove_cost() { return 0; }
-#else
- // Needs 2 CMOV's for longs.
- static constexpr int long_cmove_cost() { return 1; }
-#endif
-#ifdef _LP64
// No CMOVF/CMOVD with SSE2
static int float_cmove_cost() { return ConditionalMoveLimit; }
-#else
- // No CMOVF/CMOVD with SSE/SSE2
- static int float_cmove_cost() { return (UseSSE>=1) ? ConditionalMoveLimit : 0; }
-#endif
static bool narrow_oop_use_complex_address() {
- NOT_LP64(ShouldNotCallThis();)
assert(UseCompressedOops, "only for compressed oops code");
return (LogMinObjAlignmentInBytes <= 3);
}
static bool narrow_klass_use_complex_address() {
- NOT_LP64(ShouldNotCallThis();)
assert(UseCompressedClassPointers, "only for compressed klass code");
return (CompressedKlassPointers::shift() <= 3);
}
// Prefer ConN+DecodeN over ConP.
static bool const_oop_prefer_decode() {
- NOT_LP64(ShouldNotCallThis();)
// Prefer ConN+DecodeN over ConP.
return true;
}
// Prefer ConP over ConNKlass+DecodeNKlass.
static bool const_klass_prefer_decode() {
- NOT_LP64(ShouldNotCallThis();)
return false;
}
@@ -123,24 +104,12 @@
// Are floats converted to double when stored to stack during deoptimization?
// On x64 it is stored without conversion so we can use normal access.
- // On x32 it is stored with conversion only when FPU is used for floats.
-#ifdef _LP64
static constexpr bool float_in_double() {
return false;
}
-#else
- static bool float_in_double() {
- return (UseSSE == 0);
- }
-#endif
// Do ints take an entire long register or just half?
-#ifdef _LP64
static const bool int_in_long = true;
-#else
- static const bool int_in_long = false;
-#endif
-
// Does the CPU supports vector variable shift instructions?
static bool supports_vector_variable_shifts(void) {
diff --git a/src/hotspot/cpu/x86/methodHandles_x86.cpp b/src/hotspot/cpu/x86/methodHandles_x86.cpp
index 0d95af133fa81..f3683e7d09cc2 100644
--- a/src/hotspot/cpu/x86/methodHandles_x86.cpp
+++ b/src/hotspot/cpu/x86/methodHandles_x86.cpp
@@ -82,8 +82,8 @@ void MethodHandles::verify_klass(MacroAssembler* _masm,
__ verify_oop(obj);
__ testptr(obj, obj);
__ jcc(Assembler::zero, L_bad);
-#define PUSH { __ push(temp); LP64_ONLY( __ push(rscratch1); ) }
-#define POP { LP64_ONLY( __ pop(rscratch1); ) __ pop(temp); }
+#define PUSH { __ push(temp); __ push(rscratch1); }
+#define POP { __ pop(rscratch1); __ pop(temp); }
PUSH;
__ load_klass(temp, obj, rscratch1);
__ cmpptr(temp, ExternalAddress((address) klass_addr), rscratch1);
@@ -122,32 +122,73 @@ void MethodHandles::verify_ref_kind(MacroAssembler* _masm, int ref_kind, Registe
__ bind(L);
}
-#endif //ASSERT
+void MethodHandles::verify_method(MacroAssembler* _masm, Register method, Register temp, vmIntrinsics::ID iid) {
+ BLOCK_COMMENT("verify_method {");
+ __ verify_method_ptr(method);
+ if (VerifyMethodHandles) {
+ Label L_ok;
+ assert_different_registers(method, temp);
+
+ const Register method_holder = temp;
+ __ load_method_holder(method_holder, method);
+ __ push(method_holder); // keep holder around for diagnostic purposes
+
+ switch (iid) {
+ case vmIntrinsicID::_invokeBasic:
+ // Require compiled LambdaForm class to be fully initialized.
+ __ cmpb(Address(method_holder, InstanceKlass::init_state_offset()), InstanceKlass::fully_initialized);
+ __ jccb(Assembler::equal, L_ok);
+ break;
+
+ case vmIntrinsicID::_linkToStatic:
+ __ clinit_barrier(method_holder, &L_ok);
+ break;
+
+ case vmIntrinsicID::_linkToVirtual:
+ case vmIntrinsicID::_linkToSpecial:
+ case vmIntrinsicID::_linkToInterface:
+ // Class initialization check is too strong here. Just ensure that initialization has been initiated.
+ __ cmpb(Address(method_holder, InstanceKlass::init_state_offset()), InstanceKlass::being_initialized);
+ __ jcc(Assembler::greaterEqual, L_ok);
+
+ // init_state check failed, but it may be an abstract interface method
+ __ load_unsigned_short(temp, Address(method, Method::access_flags_offset()));
+ __ testl(temp, JVM_ACC_ABSTRACT);
+ __ jccb(Assembler::notZero, L_ok);
+ break;
+
+ default:
+ fatal("unexpected intrinsic %d: %s", vmIntrinsics::as_int(iid), vmIntrinsics::name_at(iid));
+ }
+
+ // clinit check failed for a concrete method
+ __ STOP("Method holder klass is not initialized");
+
+ __ BIND(L_ok);
+ __ pop(method_holder); // restore stack layout
+ }
+ BLOCK_COMMENT("} verify_method");
+}
+#endif // ASSERT
void MethodHandles::jump_from_method_handle(MacroAssembler* _masm, Register method, Register temp,
- bool for_compiler_entry) {
+ bool for_compiler_entry, vmIntrinsics::ID iid) {
assert(method == rbx, "interpreter calling convention");
Label L_no_such_method;
__ testptr(rbx, rbx);
__ jcc(Assembler::zero, L_no_such_method);
- __ verify_method_ptr(method);
+ verify_method(_masm, method, temp, iid);
if (!for_compiler_entry && JvmtiExport::can_post_interpreter_events()) {
Label run_compiled_code;
// JVMTI events, such as single-stepping, are implemented partly by avoiding running
// compiled code in threads for which the event is enabled. Check here for
// interp_only_mode if these events CAN be enabled.
-#ifdef _LP64
- Register rthread = r15_thread;
-#else
- Register rthread = temp;
- __ get_thread(rthread);
-#endif
// interp_only is an int, on little endian it is sufficient to test the byte only
// Is a cmpl faster?
- __ cmpb(Address(rthread, JavaThread::interp_only_mode_offset()), 0);
+ __ cmpb(Address(r15_thread, JavaThread::interp_only_mode_offset()), 0);
__ jccb(Assembler::zero, run_compiled_code);
__ jmp(Address(method, Method::interpreter_entry_offset()));
__ BIND(run_compiled_code);
@@ -182,7 +223,7 @@ void MethodHandles::jump_to_lambda_form(MacroAssembler* _masm,
__ verify_oop(method_temp);
__ access_load_at(T_ADDRESS, IN_HEAP, method_temp,
Address(method_temp, NONZERO(java_lang_invoke_ResolvedMethodName::vmtarget_offset())),
- noreg, noreg);
+ noreg);
if (VerifyMethodHandles && !for_compiler_entry) {
// make sure recv is already on stack
@@ -199,7 +240,7 @@ void MethodHandles::jump_to_lambda_form(MacroAssembler* _masm,
__ BIND(L);
}
- jump_from_method_handle(_masm, method_temp, temp2, for_compiler_entry);
+ jump_from_method_handle(_masm, method_temp, temp2, for_compiler_entry, vmIntrinsics::_invokeBasic);
BLOCK_COMMENT("} jump_to_lambda_form");
}
@@ -212,7 +253,7 @@ void MethodHandles::jump_to_native_invoker(MacroAssembler* _masm, Register nep_r
__ verify_oop(nep_reg);
__ access_load_at(T_ADDRESS, IN_HEAP, temp_target,
Address(nep_reg, NONZERO(jdk_internal_foreign_abi_NativeEntryPoint::downcall_stub_address_offset_in_bytes())),
- noreg, noreg);
+ noreg);
__ jmp(temp_target);
BLOCK_COMMENT("} jump_to_native_invoker");
@@ -324,7 +365,6 @@ void MethodHandles::generate_method_handle_dispatch(MacroAssembler* _masm,
assert(is_signature_polymorphic(iid), "expected invoke iid");
Register rbx_method = rbx; // eventual target of this invocation
// temps used in this code are not used in *either* compiled or interpreted calling sequences
-#ifdef _LP64
Register temp1 = rscratch1;
Register temp2 = rscratch2;
Register temp3 = rax;
@@ -333,19 +373,7 @@ void MethodHandles::generate_method_handle_dispatch(MacroAssembler* _masm,
assert_different_registers(temp1, j_rarg0, j_rarg1, j_rarg2, j_rarg3, j_rarg4, j_rarg5);
assert_different_registers(temp2, j_rarg0, j_rarg1, j_rarg2, j_rarg3, j_rarg4, j_rarg5);
assert_different_registers(temp3, j_rarg0, j_rarg1, j_rarg2, j_rarg3, j_rarg4, j_rarg5);
- }
-#else
- Register temp1 = (for_compiler_entry ? rsi : rdx);
- Register temp2 = rdi;
- Register temp3 = rax;
- if (for_compiler_entry) {
- assert(receiver_reg == (iid == vmIntrinsics::_linkToStatic || iid == vmIntrinsics::_linkToNative ? noreg : rcx), "only valid assignment");
- assert_different_registers(temp1, rcx, rdx);
- assert_different_registers(temp2, rcx, rdx);
- assert_different_registers(temp3, rcx, rdx);
- }
-#endif
- else {
+ } else {
assert_different_registers(temp1, temp2, temp3, saved_last_sp_register()); // don't trash lastSP
}
assert_different_registers(temp1, temp2, temp3, receiver_reg);
@@ -420,7 +448,7 @@ void MethodHandles::generate_method_handle_dispatch(MacroAssembler* _masm,
verify_ref_kind(_masm, JVM_REF_invokeSpecial, member_reg, temp3);
}
__ load_heap_oop(rbx_method, member_vmtarget);
- __ access_load_at(T_ADDRESS, IN_HEAP, rbx_method, vmtarget_method, noreg, noreg);
+ __ access_load_at(T_ADDRESS, IN_HEAP, rbx_method, vmtarget_method, noreg);
break;
case vmIntrinsics::_linkToStatic:
@@ -428,7 +456,7 @@ void MethodHandles::generate_method_handle_dispatch(MacroAssembler* _masm,
verify_ref_kind(_masm, JVM_REF_invokeStatic, member_reg, temp3);
}
__ load_heap_oop(rbx_method, member_vmtarget);
- __ access_load_at(T_ADDRESS, IN_HEAP, rbx_method, vmtarget_method, noreg, noreg);
+ __ access_load_at(T_ADDRESS, IN_HEAP, rbx_method, vmtarget_method, noreg);
break;
case vmIntrinsics::_linkToVirtual:
@@ -442,7 +470,7 @@ void MethodHandles::generate_method_handle_dispatch(MacroAssembler* _masm,
// pick out the vtable index from the MemberName, and then we can discard it:
Register temp2_index = temp2;
- __ access_load_at(T_ADDRESS, IN_HEAP, temp2_index, member_vmindex, noreg, noreg);
+ __ access_load_at(T_ADDRESS, IN_HEAP, temp2_index, member_vmindex, noreg);
if (VerifyMethodHandles) {
Label L_index_ok;
@@ -474,7 +502,7 @@ void MethodHandles::generate_method_handle_dispatch(MacroAssembler* _masm,
__ verify_klass_ptr(temp3_intf);
Register rbx_index = rbx_method;
- __ access_load_at(T_ADDRESS, IN_HEAP, rbx_index, member_vmindex, noreg, noreg);
+ __ access_load_at(T_ADDRESS, IN_HEAP, rbx_index, member_vmindex, noreg);
if (VerifyMethodHandles) {
Label L;
__ cmpl(rbx_index, 0);
@@ -504,8 +532,7 @@ void MethodHandles::generate_method_handle_dispatch(MacroAssembler* _masm,
// After figuring out which concrete method to call, jump into it.
// Note that this works in the interpreter with no data motion.
// But the compiled version will require that rcx_recv be shifted out.
- __ verify_method_ptr(rbx_method);
- jump_from_method_handle(_masm, rbx_method, temp1, for_compiler_entry);
+ jump_from_method_handle(_masm, rbx_method, temp1, for_compiler_entry, iid);
if (iid == vmIntrinsics::_linkToInterface) {
__ bind(L_incompatible_class_change_error);
@@ -651,17 +678,7 @@ void MethodHandles::trace_method_handle(MacroAssembler* _masm, const char* adapt
// save FP result, valid at some call sites (adapter_opt_return_float, ...)
__ decrement(rsp, 2 * wordSize);
-#ifdef _LP64
__ movdbl(Address(rsp, 0), xmm0);
-#else
- if (UseSSE >= 2) {
- __ movdbl(Address(rsp, 0), xmm0);
- } else if (UseSSE == 1) {
- __ movflt(Address(rsp, 0), xmm0);
- } else {
- __ fst_d(Address(rsp, 0));
- }
-#endif // LP64
// Incoming state:
// rcx: method handle
@@ -676,17 +693,7 @@ void MethodHandles::trace_method_handle(MacroAssembler* _masm, const char* adapt
__ super_call_VM_leaf(CAST_FROM_FN_PTR(address, trace_method_handle_stub_wrapper), rsp);
__ increment(rsp, sizeof(MethodHandleStubArguments));
-#ifdef _LP64
__ movdbl(xmm0, Address(rsp, 0));
-#else
- if (UseSSE >= 2) {
- __ movdbl(xmm0, Address(rsp, 0));
- } else if (UseSSE == 1) {
- __ movflt(xmm0, Address(rsp, 0));
- } else {
- __ fld_d(Address(rsp, 0));
- }
-#endif // LP64
__ increment(rsp, 2 * wordSize);
__ popa();
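
The init-state policy that the new `verify_method` above encodes in assembly is easier to see in plain C++. The sketch below is illustrative only: the enum and `holder_init_ok` are invented names, not HotSpot types, and the `_linkToStatic` case assumes the usual clinit-barrier fast path (initialized, or being initialized by the current thread).

```cpp
// Standalone sketch of the policy checked by verify_method (debug builds only).
// All names here are illustrative; they are not HotSpot APIs.
enum class InitState { allocated, loaded, linked, being_initialized, fully_initialized };
enum class Iid { invokeBasic, linkToStatic, linkToVirtual, linkToSpecial, linkToInterface };

static bool holder_init_ok(Iid iid, InitState state,
                           bool being_inited_by_self, bool method_is_abstract) {
  switch (iid) {
    case Iid::invokeBasic:
      // Compiled LambdaForm holders must be fully initialized.
      return state == InitState::fully_initialized;
    case Iid::linkToStatic:
      // Assumed clinit-barrier fast path: initialized, or currently being
      // initialized by this thread.
      return state == InitState::fully_initialized ||
             (state == InitState::being_initialized && being_inited_by_self);
    default:
      // linkToVirtual/Special/Interface: initialization merely initiated,
      // or an abstract (e.g. interface) method.
      return state >= InitState::being_initialized || method_is_abstract;
  }
}

int main() {
  // e.g. a static linkage into a class still being initialized by this thread
  return holder_init_ok(Iid::linkToStatic, InitState::being_initialized, true, false) ? 0 : 1;
}
```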
diff --git a/src/hotspot/cpu/x86/methodHandles_x86.hpp b/src/hotspot/cpu/x86/methodHandles_x86.hpp
index 6574fec66017a..6ba9b5f6a4fa4 100644
--- a/src/hotspot/cpu/x86/methodHandles_x86.hpp
+++ b/src/hotspot/cpu/x86/methodHandles_x86.hpp
@@ -38,6 +38,8 @@ enum /* platform_dependent_constants */ {
Register obj, vmClassID klass_id,
const char* error_message = "wrong klass") NOT_DEBUG_RETURN;
+ static void verify_method(MacroAssembler* _masm, Register method, Register temp, vmIntrinsics::ID iid) NOT_DEBUG_RETURN;
+
static void verify_method_handle(MacroAssembler* _masm, Register mh_reg) {
verify_klass(_masm, mh_reg, VM_CLASS_ID(MethodHandle_klass),
"reference is a MH");
@@ -48,7 +50,7 @@ enum /* platform_dependent_constants */ {
// Similar to InterpreterMacroAssembler::jump_from_interpreted.
// Takes care of special dispatch from single stepping too.
static void jump_from_method_handle(MacroAssembler* _masm, Register method, Register temp,
- bool for_compiler_entry);
+ bool for_compiler_entry, vmIntrinsics::ID iid);
static void jump_to_lambda_form(MacroAssembler* _masm,
Register recv, Register method_temp,
@@ -60,5 +62,5 @@ enum /* platform_dependent_constants */ {
static Register saved_last_sp_register() {
// Should be in sharedRuntime, not here.
- return LP64_ONLY(r13) NOT_LP64(rsi);
+ return r13;
}
diff --git a/src/hotspot/cpu/x86/nativeInst_x86.cpp b/src/hotspot/cpu/x86/nativeInst_x86.cpp
index 4ee741077dc06..c3345be2172f1 100644
--- a/src/hotspot/cpu/x86/nativeInst_x86.cpp
+++ b/src/hotspot/cpu/x86/nativeInst_x86.cpp
@@ -67,9 +67,7 @@ void NativeCall::print() {
// Inserts a native call instruction at a given pc
void NativeCall::insert(address code_pos, address entry) {
intptr_t disp = (intptr_t)entry - ((intptr_t)code_pos + 1 + 4);
-#ifdef AMD64
guarantee(disp == (intptr_t)(jint)disp, "must be 32-bit offset");
-#endif // AMD64
*code_pos = instruction_code;
*((int32_t *)(code_pos+1)) = (int32_t) disp;
ICache::invalidate_range(code_pos, instruction_size);
@@ -140,7 +138,7 @@ bool NativeCall::is_displacement_aligned() {
// Used in the runtime linkage of calls; see class CompiledIC.
// (Cf. 4506997 and 4479829, where threads witnessed garbage displacements.)
void NativeCall::set_destination_mt_safe(address dest) {
- debug_only(verify());
+ DEBUG_ONLY(verify());
// Make sure patching code is locked. No two threads can patch at the same
// time but one may be executing this code.
assert(CodeCache_lock->is_locked() || SafepointSynchronize::is_at_safepoint() ||
@@ -157,7 +155,6 @@ void NativeCall::set_destination_mt_safe(address dest) {
void NativeMovConstReg::verify() {
-#ifdef AMD64
// make sure code pattern is actually a mov reg64, imm64 instruction
bool valid_rex_prefix = ubyte_at(0) == Assembler::REX_W || ubyte_at(0) == Assembler::REX_WB;
bool valid_rex2_prefix = ubyte_at(0) == Assembler::REX2 &&
@@ -169,12 +166,6 @@ void NativeMovConstReg::verify() {
print();
fatal("not a REX.W[B] mov reg64, imm64");
}
-#else
- // make sure code pattern is actually a mov reg, imm32 instruction
- u_char test_byte = *(u_char*)instruction_address();
- u_char test_byte_2 = test_byte & ( 0xff ^ register_mask);
- if (test_byte_2 != instruction_code) fatal("not a mov reg, imm32");
-#endif // AMD64
}
@@ -192,12 +183,10 @@ int NativeMovRegMem::instruction_start() const {
// See comment in Assembler::locate_operand() about VEX prefixes.
if (instr_0 == instruction_VEX_prefix_2bytes) {
assert((UseAVX > 0), "shouldn't have VEX prefix");
- NOT_LP64(assert((0xC0 & ubyte_at(1)) == 0xC0, "shouldn't have LDS and LES instructions"));
return 2;
}
if (instr_0 == instruction_VEX_prefix_3bytes) {
assert((UseAVX > 0), "shouldn't have VEX prefix");
- NOT_LP64(assert((0xC0 & ubyte_at(1)) == 0xC0, "shouldn't have LDS and LES instructions"));
return 3;
}
if (instr_0 == instruction_EVEX_prefix_4bytes) {
@@ -313,8 +302,7 @@ void NativeMovRegMem::print() {
void NativeLoadAddress::verify() {
// make sure code pattern is actually a mov [reg+offset], reg instruction
u_char test_byte = *(u_char*)instruction_address();
- if ( ! ((test_byte == lea_instruction_code)
- LP64_ONLY(|| (test_byte == mov64_instruction_code) ))) {
+ if ((test_byte != lea_instruction_code) && (test_byte != mov64_instruction_code)) {
fatal ("not a lea reg, [reg+offs] instruction");
}
}
@@ -340,9 +328,7 @@ void NativeJump::verify() {
void NativeJump::insert(address code_pos, address entry) {
intptr_t disp = (intptr_t)entry - ((intptr_t)code_pos + 1 + 4);
-#ifdef AMD64
guarantee(disp == (intptr_t)(int32_t)disp, "must be 32-bit offset");
-#endif // AMD64
*code_pos = instruction_code;
*((int32_t*)(code_pos + 1)) = (int32_t)disp;
@@ -355,11 +341,7 @@ void NativeJump::check_verified_entry_alignment(address entry, address verified_
// in use. The patching in that instance must happen only when certain
// alignment restrictions are true. These guarantees check those
// conditions.
-#ifdef AMD64
const int linesize = 64;
-#else
- const int linesize = 32;
-#endif // AMD64
// Must be wordSize aligned
guarantee(((uintptr_t) verified_entry & (wordSize -1)) == 0,
@@ -386,7 +368,6 @@ void NativeJump::check_verified_entry_alignment(address entry, address verified_
//
void NativeJump::patch_verified_entry(address entry, address verified_entry, address dest) {
// complete jump instruction (to be inserted) is in code_buffer;
-#ifdef _LP64
union {
jlong cb_long;
unsigned char code_buffer[8];
@@ -402,43 +383,6 @@ void NativeJump::patch_verified_entry(address entry, address verified_entry, add
Atomic::store((jlong *) verified_entry, u.cb_long);
ICache::invalidate_range(verified_entry, 8);
-
-#else
- unsigned char code_buffer[5];
- code_buffer[0] = instruction_code;
- intptr_t disp = (intptr_t)dest - ((intptr_t)verified_entry + 1 + 4);
- *(int32_t*)(code_buffer + 1) = (int32_t)disp;
-
- check_verified_entry_alignment(entry, verified_entry);
-
- // Can't call nativeJump_at() because it's asserts jump exists
- NativeJump* n_jump = (NativeJump*) verified_entry;
-
- //First patch dummy jmp in place
-
- unsigned char patch[4];
- assert(sizeof(patch)==sizeof(int32_t), "sanity check");
- patch[0] = 0xEB; // jmp rel8
- patch[1] = 0xFE; // jmp to self
- patch[2] = 0xEB;
- patch[3] = 0xFE;
-
- // First patch dummy jmp in place
- *(int32_t*)verified_entry = *(int32_t *)patch;
-
- n_jump->wrote(0);
-
- // Patch 5th byte (from jump instruction)
- verified_entry[4] = code_buffer[4];
-
- n_jump->wrote(4);
-
- // Patch bytes 0-3 (from jump instruction)
- *(int32_t*)verified_entry = *(int32_t *)code_buffer;
- // Invalidate. Opteron requires a flush after every write.
- n_jump->wrote(0);
-#endif // _LP64
-
}
void NativeIllegalInstruction::insert(address code_pos) {
@@ -455,9 +399,7 @@ void NativeGeneralJump::verify() {
void NativeGeneralJump::insert_unconditional(address code_pos, address entry) {
intptr_t disp = (intptr_t)entry - ((intptr_t)code_pos + 1 + 4);
-#ifdef AMD64
guarantee(disp == (intptr_t)(int32_t)disp, "must be 32-bit offset");
-#endif // AMD64
*code_pos = unconditional_long_jump;
*((int32_t *)(code_pos+1)) = (int32_t) disp;
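
The repeated `guarantee(disp == (intptr_t)(jint)disp, "must be 32-bit offset")` checks in this file all express the same constraint: the target must be reachable with a rel32 displacement measured from the end of the instruction. A minimal standalone sketch (the helper name is made up; 0xE9 is the `jmp rel32` opcode `NativeJump` patches in):

```cpp
#include <cstdint>
#include <cstring>

// Encode a 5-byte "jmp rel32" at address 'at' targeting 'dest', computing the
// displacement the same way NativeJump::insert does. Returns false when the
// target is outside rel32 range -- the case the guarantees above reject.
static bool encode_jmp_rel32(uint8_t out[5], uintptr_t at, uintptr_t dest) {
  int64_t disp = (int64_t)dest - (int64_t)(at + 1 + 4);  // relative to next instruction
  if (disp != (int64_t)(int32_t)disp) {
    return false;                                        // "must be 32-bit offset"
  }
  out[0] = 0xE9;                                         // jmp rel32
  int32_t d32 = (int32_t)disp;
  std::memcpy(out + 1, &d32, sizeof(d32));
  return true;
}
```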
diff --git a/src/hotspot/cpu/x86/nativeInst_x86.hpp b/src/hotspot/cpu/x86/nativeInst_x86.hpp
index d02387aa9ffbb..b2448cb99fdb0 100644
--- a/src/hotspot/cpu/x86/nativeInst_x86.hpp
+++ b/src/hotspot/cpu/x86/nativeInst_x86.hpp
@@ -126,10 +126,8 @@ class NativeCall: public NativeInstruction {
address return_address() const { return addr_at(return_address_offset); }
address destination() const;
void set_destination(address dest) {
-#ifdef AMD64
intptr_t disp = dest - return_address();
guarantee(disp == (intptr_t)(jint)disp, "must be 32-bit offset");
-#endif // AMD64
set_int_at(displacement_offset, (int)(dest - return_address()));
}
// Returns whether the 4-byte displacement operand is 4-byte aligned.
@@ -211,15 +209,9 @@ class NativeCallReg: public NativeInstruction {
// Instruction format for implied addressing mode immediate operand move to register instruction:
// [REX/REX2] [OPCODE] [IMM32]
class NativeMovConstReg: public NativeInstruction {
-#ifdef AMD64
static const bool has_rex = true;
static const int rex_size = 1;
static const int rex2_size = 2;
-#else
- static const bool has_rex = false;
- static const int rex_size = 0;
- static const int rex2_size = 0;
-#endif // AMD64
public:
enum Intel_specific_constants {
instruction_code = 0xB8,
@@ -390,13 +382,8 @@ inline NativeMovRegMem* nativeMovRegMem_at (address address) {
// leal reg, [reg + offset]
class NativeLoadAddress: public NativeMovRegMem {
-#ifdef AMD64
static const bool has_rex = true;
static const int rex_size = 1;
-#else
- static const bool has_rex = false;
- static const int rex_size = 0;
-#endif // AMD64
public:
enum Intel_specific_constants {
instruction_prefix_wide = Assembler::REX_W,
@@ -447,9 +434,7 @@ class NativeJump: public NativeInstruction {
if (dest == (address) -1) {
val = -5; // jump to self
}
-#ifdef AMD64
assert((labs(val) & 0xFFFFFFFF00000000) == 0 || dest == (address)-1, "must be 32bit offset or -1");
-#endif // AMD64
set_int_at(data_offset, (jint)val);
}
@@ -503,7 +488,7 @@ class NativeGeneralJump: public NativeInstruction {
inline NativeGeneralJump* nativeGeneralJump_at(address address) {
NativeGeneralJump* jump = (NativeGeneralJump*)(address);
- debug_only(jump->verify();)
+ DEBUG_ONLY(jump->verify();)
return jump;
}
@@ -572,19 +557,14 @@ inline bool NativeInstruction::is_jump_reg() {
inline bool NativeInstruction::is_cond_jump() { return (int_at(0) & 0xF0FF) == 0x800F /* long jump */ ||
(ubyte_at(0) & 0xF0) == 0x70; /* short jump */ }
inline bool NativeInstruction::is_safepoint_poll() {
-#ifdef AMD64
const bool has_rex_prefix = ubyte_at(0) == NativeTstRegMem::instruction_rex_b_prefix;
const int test_offset = has_rex2_prefix() ? 2 : (has_rex_prefix ? 1 : 0);
-#else
- const int test_offset = 0;
-#endif
const bool is_test_opcode = ubyte_at(test_offset) == NativeTstRegMem::instruction_code_memXregl;
const bool is_rax_target = (ubyte_at(test_offset + 1) & NativeTstRegMem::modrm_mask) == NativeTstRegMem::modrm_reg;
return is_test_opcode && is_rax_target;
}
inline bool NativeInstruction::is_mov_literal64() {
-#ifdef AMD64
bool valid_rex_prefix = ubyte_at(0) == Assembler::REX_W || ubyte_at(0) == Assembler::REX_WB;
bool valid_rex2_prefix = ubyte_at(0) == Assembler::REX2 &&
(ubyte_at(1) == Assembler::REX2BIT_W ||
@@ -593,9 +573,6 @@ inline bool NativeInstruction::is_mov_literal64() {
int opcode = has_rex2_prefix() ? ubyte_at(2) : ubyte_at(1);
return ((valid_rex_prefix || valid_rex2_prefix) && (opcode & (0xff ^ NativeMovConstReg::register_mask)) == 0xB8);
-#else
- return false;
-#endif // AMD64
}
class NativePostCallNop: public NativeInstruction {
diff --git a/src/hotspot/cpu/x86/runtime_x86_64.cpp b/src/hotspot/cpu/x86/runtime_x86_64.cpp
index a063c7aeb37a9..027a523b33d72 100644
--- a/src/hotspot/cpu/x86/runtime_x86_64.cpp
+++ b/src/hotspot/cpu/x86/runtime_x86_64.cpp
@@ -61,6 +61,9 @@ UncommonTrapBlob* OptoRuntime::generate_uncommon_trap_blob() {
// Setup code generation tools
const char* name = OptoRuntime::stub_name(OptoStubId::uncommon_trap_id);
CodeBuffer buffer(name, 2048, 1024);
+ if (buffer.blob() == nullptr) {
+ return nullptr;
+ }
MacroAssembler* masm = new MacroAssembler(&buffer);
assert(SimpleRuntimeFrame::framesize % 4 == 0, "sp not 16-byte aligned");
@@ -267,6 +270,9 @@ ExceptionBlob* OptoRuntime::generate_exception_blob() {
// Setup code generation tools
const char* name = OptoRuntime::stub_name(OptoStubId::exception_id);
CodeBuffer buffer(name, 2048, 1024);
+ if (buffer.blob() == nullptr) {
+ return nullptr;
+ }
MacroAssembler* masm = new MacroAssembler(&buffer);
diff --git a/src/hotspot/cpu/x86/sharedRuntime_x86.cpp b/src/hotspot/cpu/x86/sharedRuntime_x86.cpp
index 0a277a4eb69f6..b8a4b82915921 100644
--- a/src/hotspot/cpu/x86/sharedRuntime_x86.cpp
+++ b/src/hotspot/cpu/x86/sharedRuntime_x86.cpp
@@ -73,21 +73,14 @@ void SharedRuntime::inline_check_hashcode_from_object_header(MacroAssembler* mas
}
// get hash
-#ifdef _LP64
// Read the header and build a mask to get its hash field.
// Depend on hash_mask being at most 32 bits and avoid the use of hash_mask_in_place
// because it could be larger than 32 bits in a 64-bit vm. See markWord.hpp.
__ shrptr(result, markWord::hash_shift);
__ andptr(result, markWord::hash_mask);
-#else
- __ andptr(result, markWord::hash_mask_in_place);
-#endif //_LP64
// test if hashCode exists
- __ jcc(Assembler::zero, slowCase);
-#ifndef _LP64
- __ shrptr(result, markWord::hash_shift);
-#endif
+ __ jccb(Assembler::zero, slowCase);
__ ret(0);
__ bind(slowCase);
}
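
For reference, the fast-path hash extraction above (shift, then mask, then test for zero) reduces to the following scalar form. SHIFT and MASK are placeholders, not markWord's actual layout constants:

```cpp
#include <cstdint>

// Sketch of inline_check_hashcode_from_object_header's fast path: pull the
// hash field out of the header word and fall back to the slow case when the
// field is still zero (no identity hash installed yet).
static const int      SHIFT = 8;           // placeholder for markWord::hash_shift
static const uint64_t MASK  = 0x7FFFFFFF;  // placeholder for markWord::hash_mask

static bool try_fast_hash(uint64_t mark_word, uint64_t* hash_out) {
  uint64_t h = (mark_word >> SHIFT) & MASK;
  if (h == 0) {
    return false;  // jccb(zero, slowCase)
  }
  *hash_out = h;
  return true;
}
```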
diff --git a/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp b/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
index d3e7e23678ae7..7811d59d12d11 100644
--- a/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
+++ b/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp
@@ -675,7 +675,6 @@ static void patch_callers_callsite(MacroAssembler *masm) {
__ bind(L);
}
-
static void gen_c2i_adapter(MacroAssembler *masm,
int total_args_passed,
int comp_args_on_stack,
@@ -826,19 +825,6 @@ static void gen_c2i_adapter(MacroAssembler *masm,
__ jmp(rcx);
}
-static void range_check(MacroAssembler* masm, Register pc_reg, Register temp_reg,
- address code_start, address code_end,
- Label& L_ok) {
- Label L_fail;
- __ lea(temp_reg, AddressLiteral(code_start, relocInfo::none));
- __ cmpptr(pc_reg, temp_reg);
- __ jcc(Assembler::belowEqual, L_fail);
- __ lea(temp_reg, AddressLiteral(code_end, relocInfo::none));
- __ cmpptr(pc_reg, temp_reg);
- __ jcc(Assembler::below, L_ok);
- __ bind(L_fail);
-}
-
void SharedRuntime::gen_i2c_adapter(MacroAssembler *masm,
int total_args_passed,
int comp_args_on_stack,
@@ -871,41 +857,6 @@ void SharedRuntime::gen_i2c_adapter(MacroAssembler *masm,
// If this happens, control eventually transfers back to the compiled
// caller, but with an uncorrected stack, causing delayed havoc.
- if (VerifyAdapterCalls &&
- (Interpreter::code() != nullptr || StubRoutines::final_stubs_code() != nullptr)) {
- // So, let's test for cascading c2i/i2c adapters right now.
- // assert(Interpreter::contains($return_addr) ||
- // StubRoutines::contains($return_addr),
- // "i2c adapter must return to an interpreter frame");
- __ block_comment("verify_i2c { ");
- // Pick up the return address
- __ movptr(rax, Address(rsp, 0));
- Label L_ok;
- if (Interpreter::code() != nullptr) {
- range_check(masm, rax, r11,
- Interpreter::code()->code_start(),
- Interpreter::code()->code_end(),
- L_ok);
- }
- if (StubRoutines::initial_stubs_code() != nullptr) {
- range_check(masm, rax, r11,
- StubRoutines::initial_stubs_code()->code_begin(),
- StubRoutines::initial_stubs_code()->code_end(),
- L_ok);
- }
- if (StubRoutines::final_stubs_code() != nullptr) {
- range_check(masm, rax, r11,
- StubRoutines::final_stubs_code()->code_begin(),
- StubRoutines::final_stubs_code()->code_end(),
- L_ok);
- }
- const char* msg = "i2c adapter must return to an interpreter frame";
- __ block_comment(msg);
- __ stop(msg);
- __ bind(L_ok);
- __ block_comment("} verify_i2ce ");
- }
-
// Must preserve original SP for loading incoming arguments because
// we need to align the outgoing SP for compiled code.
__ movptr(r11, rsp);
@@ -1050,12 +1001,12 @@ void SharedRuntime::gen_i2c_adapter(MacroAssembler *masm,
}
// ---------------------------------------------------------------
-AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm,
- int total_args_passed,
- int comp_args_on_stack,
- const BasicType *sig_bt,
- const VMRegPair *regs,
- AdapterFingerPrint* fingerprint) {
+void SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm,
+ int total_args_passed,
+ int comp_args_on_stack,
+ const BasicType *sig_bt,
+ const VMRegPair *regs,
+ AdapterHandlerEntry* handler) {
address i2c_entry = __ pc();
gen_i2c_adapter(masm, total_args_passed, comp_args_on_stack, sig_bt, regs);
@@ -1104,7 +1055,7 @@ AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm
Register klass = rscratch1;
__ load_method_holder(klass, method);
- __ clinit_barrier(klass, r15_thread, &L_skip_barrier /*L_fast_path*/);
+ __ clinit_barrier(klass, &L_skip_barrier /*L_fast_path*/);
__ jump(RuntimeAddress(SharedRuntime::get_handle_wrong_method_stub())); // slow path
@@ -1117,7 +1068,8 @@ AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm
gen_c2i_adapter(masm, total_args_passed, comp_args_on_stack, sig_bt, regs, skip_fixup);
- return AdapterHandlerLibrary::new_entry(fingerprint, i2c_entry, c2i_entry, c2i_unverified_entry, c2i_no_clinit_check_entry);
+ handler->set_entry_points(i2c_entry, c2i_entry, c2i_unverified_entry, c2i_no_clinit_check_entry);
+ return;
}
int SharedRuntime::c_calling_convention(const BasicType *sig_bt,
@@ -2003,7 +1955,7 @@ nmethod* SharedRuntime::generate_native_wrapper(MacroAssembler* masm,
Label L_skip_barrier;
Register klass = r10;
__ mov_metadata(klass, method->method_holder()); // InstanceKlass*
- __ clinit_barrier(klass, r15_thread, &L_skip_barrier /*L_fast_path*/);
+ __ clinit_barrier(klass, &L_skip_barrier /*L_fast_path*/);
__ jump(RuntimeAddress(SharedRuntime::get_handle_wrong_method_stub())); // slow path
@@ -2280,7 +2232,7 @@ nmethod* SharedRuntime::generate_native_wrapper(MacroAssembler* masm,
__ inc_held_monitor_count();
} else {
assert(LockingMode == LM_LIGHTWEIGHT, "must be");
- __ lightweight_lock(lock_reg, obj_reg, swap_reg, r15_thread, rscratch1, slow_path_lock);
+ __ lightweight_lock(lock_reg, obj_reg, swap_reg, rscratch1, slow_path_lock);
}
// Slow path will re-enter here
@@ -2340,7 +2292,7 @@ nmethod* SharedRuntime::generate_native_wrapper(MacroAssembler* masm,
Label Continue;
Label slow_path;
- __ safepoint_poll(slow_path, r15_thread, true /* at_return */, false /* in_nmethod */);
+ __ safepoint_poll(slow_path, true /* at_return */, false /* in_nmethod */);
__ cmpl(Address(r15_thread, JavaThread::suspend_flags_offset()), 0);
__ jcc(Assembler::equal, Continue);
@@ -2431,7 +2383,7 @@ nmethod* SharedRuntime::generate_native_wrapper(MacroAssembler* masm,
__ dec_held_monitor_count();
} else {
assert(LockingMode == LM_LIGHTWEIGHT, "must be");
- __ lightweight_unlock(obj_reg, swap_reg, r15_thread, lock_reg, slow_path_unlock);
+ __ lightweight_unlock(obj_reg, swap_reg, lock_reg, slow_path_unlock);
}
// slow path re-enters here
@@ -2456,7 +2408,6 @@ nmethod* SharedRuntime::generate_native_wrapper(MacroAssembler* masm,
// Unbox oop result, e.g. JNIHandles::resolve value.
if (is_reference_type(ret_type)) {
__ resolve_jobject(rax /* value */,
- r15_thread /* thread */,
rcx /* tmp */);
}
@@ -3234,7 +3185,7 @@ RuntimeStub* SharedRuntime::generate_resolve_blob(SharedStubId id, address desti
__ jcc(Assembler::notEqual, pending);
// get the returned Method*
- __ get_vm_result_2(rbx, r15_thread);
+ __ get_vm_result_metadata(rbx);
__ movptr(Address(rsp, RegisterSaver::rbx_offset_in_bytes()), rbx);
__ movptr(Address(rsp, RegisterSaver::rax_offset_in_bytes()), rax);
@@ -3253,7 +3204,7 @@ RuntimeStub* SharedRuntime::generate_resolve_blob(SharedStubId id, address desti
// exception pending => remove activation and forward to exception handler
- __ movptr(Address(r15_thread, JavaThread::vm_result_offset()), NULL_WORD);
+ __ movptr(Address(r15_thread, JavaThread::vm_result_oop_offset()), NULL_WORD);
__ movptr(rax, Address(r15_thread, Thread::pending_exception_offset()));
__ jump(RuntimeAddress(StubRoutines::forward_exception_entry()));
@@ -3661,7 +3612,7 @@ RuntimeStub* SharedRuntime::generate_jfr_write_checkpoint() {
__ reset_last_Java_frame(true);
// rax is jobject handle result, unpack and process it through a barrier.
- __ resolve_global_jobject(rax, r15_thread, c_rarg0);
+ __ resolve_global_jobject(rax, c_rarg0);
__ leave();
__ ret(0);
diff --git a/src/hotspot/cpu/x86/stubDeclarations_x86.hpp b/src/hotspot/cpu/x86/stubDeclarations_x86.hpp
index 9f6c1ec60ef19..dcb919ddcd097 100644
--- a/src/hotspot/cpu/x86/stubDeclarations_x86.hpp
+++ b/src/hotspot/cpu/x86/stubDeclarations_x86.hpp
@@ -34,58 +34,43 @@
do_stub(initial, verify_mxcsr) \
do_arch_entry(x86, initial, verify_mxcsr, verify_mxcsr_entry, \
verify_mxcsr_entry) \
- LP64_ONLY( \
- do_stub(initial, get_previous_sp) \
- do_arch_entry(x86, initial, get_previous_sp, \
- get_previous_sp_entry, \
- get_previous_sp_entry) \
- do_stub(initial, f2i_fixup) \
- do_arch_entry(x86, initial, f2i_fixup, f2i_fixup, f2i_fixup) \
- do_stub(initial, f2l_fixup) \
- do_arch_entry(x86, initial, f2l_fixup, f2l_fixup, f2l_fixup) \
- do_stub(initial, d2i_fixup) \
- do_arch_entry(x86, initial, d2i_fixup, d2i_fixup, d2i_fixup) \
- do_stub(initial, d2l_fixup) \
- do_arch_entry(x86, initial, d2l_fixup, d2l_fixup, d2l_fixup) \
- do_stub(initial, float_sign_mask) \
- do_arch_entry(x86, initial, float_sign_mask, float_sign_mask, \
- float_sign_mask) \
- do_stub(initial, float_sign_flip) \
- do_arch_entry(x86, initial, float_sign_flip, float_sign_flip, \
- float_sign_flip) \
- do_stub(initial, double_sign_mask) \
- do_arch_entry(x86, initial, double_sign_mask, double_sign_mask, \
- double_sign_mask) \
- do_stub(initial, double_sign_flip) \
- do_arch_entry(x86, initial, double_sign_flip, double_sign_flip, \
- double_sign_flip) \
- ) \
- NOT_LP64( \
- do_stub(initial, verify_fpu_cntrl_word) \
- do_arch_entry(x86, initial, verify_fpu_cntrl_word, \
- verify_fpu_cntrl_wrd_entry, \
- verify_fpu_cntrl_wrd_entry) \
- do_stub(initial, d2i_wrapper) \
- do_arch_entry(x86, initial, d2i_wrapper, d2i_wrapper, \
- d2i_wrapper) \
- do_stub(initial, d2l_wrapper) \
- do_arch_entry(x86, initial, d2l_wrapper, d2l_wrapper, \
- d2l_wrapper) \
- ) \
-
+ do_stub(initial, get_previous_sp) \
+ do_arch_entry(x86, initial, get_previous_sp, \
+ get_previous_sp_entry, \
+ get_previous_sp_entry) \
+ do_stub(initial, f2i_fixup) \
+ do_arch_entry(x86, initial, f2i_fixup, f2i_fixup, f2i_fixup) \
+ do_stub(initial, f2l_fixup) \
+ do_arch_entry(x86, initial, f2l_fixup, f2l_fixup, f2l_fixup) \
+ do_stub(initial, d2i_fixup) \
+ do_arch_entry(x86, initial, d2i_fixup, d2i_fixup, d2i_fixup) \
+ do_stub(initial, d2l_fixup) \
+ do_arch_entry(x86, initial, d2l_fixup, d2l_fixup, d2l_fixup) \
+ do_stub(initial, float_sign_mask) \
+ do_arch_entry(x86, initial, float_sign_mask, float_sign_mask, \
+ float_sign_mask) \
+ do_stub(initial, float_sign_flip) \
+ do_arch_entry(x86, initial, float_sign_flip, float_sign_flip, \
+ float_sign_flip) \
+ do_stub(initial, double_sign_mask) \
+ do_arch_entry(x86, initial, double_sign_mask, double_sign_mask, \
+ double_sign_mask) \
+ do_stub(initial, double_sign_flip) \
+ do_arch_entry(x86, initial, double_sign_flip, double_sign_flip, \
+ double_sign_flip) \
#define STUBGEN_CONTINUATION_BLOBS_ARCH_DO(do_stub, \
do_arch_blob, \
do_arch_entry, \
do_arch_entry_init) \
- do_arch_blob(continuation, 1000 LP64_ONLY(+2000)) \
+ do_arch_blob(continuation, 3000) \
#define STUBGEN_COMPILER_BLOBS_ARCH_DO(do_stub, \
do_arch_blob, \
do_arch_entry, \
do_arch_entry_init) \
- do_arch_blob(compiler, 20000 LP64_ONLY(+64000) WINDOWS_ONLY(+2000)) \
+ do_arch_blob(compiler, 109000 WINDOWS_ONLY(+2000)) \
do_stub(compiler, vector_float_sign_mask) \
do_arch_entry(x86, compiler, vector_float_sign_mask, \
vector_float_sign_mask, vector_float_sign_mask) \
@@ -173,90 +158,88 @@
do_arch_entry(x86, compiler, pshuffle_byte_flip_mask, \
pshuffle_byte_flip_mask_addr, \
pshuffle_byte_flip_mask_addr) \
- LP64_ONLY( \
- /* x86_64 exposes these 3 stubs via a generic entry array */ \
- /* oher arches use arch-specific entries */ \
- /* this really needs rationalising */ \
- do_stub(compiler, string_indexof_linear_ll) \
- do_stub(compiler, string_indexof_linear_uu) \
- do_stub(compiler, string_indexof_linear_ul) \
- do_stub(compiler, pshuffle_byte_flip_mask_sha512) \
- do_arch_entry(x86, compiler, pshuffle_byte_flip_mask_sha512, \
- pshuffle_byte_flip_mask_addr_sha512, \
- pshuffle_byte_flip_mask_addr_sha512) \
- do_stub(compiler, compress_perm_table32) \
- do_arch_entry(x86, compiler, compress_perm_table32, \
- compress_perm_table32, compress_perm_table32) \
- do_stub(compiler, compress_perm_table64) \
- do_arch_entry(x86, compiler, compress_perm_table64, \
- compress_perm_table64, compress_perm_table64) \
- do_stub(compiler, expand_perm_table32) \
- do_arch_entry(x86, compiler, expand_perm_table32, \
- expand_perm_table32, expand_perm_table32) \
- do_stub(compiler, expand_perm_table64) \
- do_arch_entry(x86, compiler, expand_perm_table64, \
- expand_perm_table64, expand_perm_table64) \
- do_stub(compiler, avx2_shuffle_base64) \
- do_arch_entry(x86, compiler, avx2_shuffle_base64, \
- avx2_shuffle_base64, base64_avx2_shuffle_addr) \
- do_stub(compiler, avx2_input_mask_base64) \
- do_arch_entry(x86, compiler, avx2_input_mask_base64, \
- avx2_input_mask_base64, \
- base64_avx2_input_mask_addr) \
- do_stub(compiler, avx2_lut_base64) \
- do_arch_entry(x86, compiler, avx2_lut_base64, \
- avx2_lut_base64, base64_avx2_lut_addr) \
- do_stub(compiler, avx2_decode_tables_base64) \
- do_arch_entry(x86, compiler, avx2_decode_tables_base64, \
- avx2_decode_tables_base64, \
- base64_AVX2_decode_tables_addr) \
- do_stub(compiler, avx2_decode_lut_tables_base64) \
- do_arch_entry(x86, compiler, avx2_decode_lut_tables_base64, \
- avx2_decode_lut_tables_base64, \
- base64_AVX2_decode_LUT_tables_addr) \
- do_stub(compiler, shuffle_base64) \
- do_arch_entry(x86, compiler, shuffle_base64, shuffle_base64, \
- base64_shuffle_addr) \
- do_stub(compiler, lookup_lo_base64) \
- do_arch_entry(x86, compiler, lookup_lo_base64, lookup_lo_base64, \
- base64_vbmi_lookup_lo_addr) \
- do_stub(compiler, lookup_hi_base64) \
- do_arch_entry(x86, compiler, lookup_hi_base64, lookup_hi_base64, \
- base64_vbmi_lookup_hi_addr) \
- do_stub(compiler, lookup_lo_base64url) \
- do_arch_entry(x86, compiler, lookup_lo_base64url, \
- lookup_lo_base64url, \
- base64_vbmi_lookup_lo_url_addr) \
- do_stub(compiler, lookup_hi_base64url) \
- do_arch_entry(x86, compiler, lookup_hi_base64url, \
- lookup_hi_base64url, \
- base64_vbmi_lookup_hi_url_addr) \
- do_stub(compiler, pack_vec_base64) \
- do_arch_entry(x86, compiler, pack_vec_base64, pack_vec_base64, \
- base64_vbmi_pack_vec_addr) \
- do_stub(compiler, join_0_1_base64) \
- do_arch_entry(x86, compiler, join_0_1_base64, join_0_1_base64, \
- base64_vbmi_join_0_1_addr) \
- do_stub(compiler, join_1_2_base64) \
- do_arch_entry(x86, compiler, join_1_2_base64, join_1_2_base64, \
- base64_vbmi_join_1_2_addr) \
- do_stub(compiler, join_2_3_base64) \
- do_arch_entry(x86, compiler, join_2_3_base64, join_2_3_base64, \
- base64_vbmi_join_2_3_addr) \
- do_stub(compiler, encoding_table_base64) \
- do_arch_entry(x86, compiler, encoding_table_base64, \
- encoding_table_base64, base64_encoding_table_addr) \
- do_stub(compiler, decoding_table_base64) \
- do_arch_entry(x86, compiler, decoding_table_base64, \
- decoding_table_base64, base64_decoding_table_addr) \
- ) \
+ /* x86_64 exposes these 3 stubs via a generic entry array */ \
+ /* other arches use arch-specific entries */ \
+ /* this really needs rationalising */ \
+ do_stub(compiler, string_indexof_linear_ll) \
+ do_stub(compiler, string_indexof_linear_uu) \
+ do_stub(compiler, string_indexof_linear_ul) \
+ do_stub(compiler, pshuffle_byte_flip_mask_sha512) \
+ do_arch_entry(x86, compiler, pshuffle_byte_flip_mask_sha512, \
+ pshuffle_byte_flip_mask_addr_sha512, \
+ pshuffle_byte_flip_mask_addr_sha512) \
+ do_stub(compiler, compress_perm_table32) \
+ do_arch_entry(x86, compiler, compress_perm_table32, \
+ compress_perm_table32, compress_perm_table32) \
+ do_stub(compiler, compress_perm_table64) \
+ do_arch_entry(x86, compiler, compress_perm_table64, \
+ compress_perm_table64, compress_perm_table64) \
+ do_stub(compiler, expand_perm_table32) \
+ do_arch_entry(x86, compiler, expand_perm_table32, \
+ expand_perm_table32, expand_perm_table32) \
+ do_stub(compiler, expand_perm_table64) \
+ do_arch_entry(x86, compiler, expand_perm_table64, \
+ expand_perm_table64, expand_perm_table64) \
+ do_stub(compiler, avx2_shuffle_base64) \
+ do_arch_entry(x86, compiler, avx2_shuffle_base64, \
+ avx2_shuffle_base64, base64_avx2_shuffle_addr) \
+ do_stub(compiler, avx2_input_mask_base64) \
+ do_arch_entry(x86, compiler, avx2_input_mask_base64, \
+ avx2_input_mask_base64, \
+ base64_avx2_input_mask_addr) \
+ do_stub(compiler, avx2_lut_base64) \
+ do_arch_entry(x86, compiler, avx2_lut_base64, \
+ avx2_lut_base64, base64_avx2_lut_addr) \
+ do_stub(compiler, avx2_decode_tables_base64) \
+ do_arch_entry(x86, compiler, avx2_decode_tables_base64, \
+ avx2_decode_tables_base64, \
+ base64_AVX2_decode_tables_addr) \
+ do_stub(compiler, avx2_decode_lut_tables_base64) \
+ do_arch_entry(x86, compiler, avx2_decode_lut_tables_base64, \
+ avx2_decode_lut_tables_base64, \
+ base64_AVX2_decode_LUT_tables_addr) \
+ do_stub(compiler, shuffle_base64) \
+ do_arch_entry(x86, compiler, shuffle_base64, shuffle_base64, \
+ base64_shuffle_addr) \
+ do_stub(compiler, lookup_lo_base64) \
+ do_arch_entry(x86, compiler, lookup_lo_base64, lookup_lo_base64, \
+ base64_vbmi_lookup_lo_addr) \
+ do_stub(compiler, lookup_hi_base64) \
+ do_arch_entry(x86, compiler, lookup_hi_base64, lookup_hi_base64, \
+ base64_vbmi_lookup_hi_addr) \
+ do_stub(compiler, lookup_lo_base64url) \
+ do_arch_entry(x86, compiler, lookup_lo_base64url, \
+ lookup_lo_base64url, \
+ base64_vbmi_lookup_lo_url_addr) \
+ do_stub(compiler, lookup_hi_base64url) \
+ do_arch_entry(x86, compiler, lookup_hi_base64url, \
+ lookup_hi_base64url, \
+ base64_vbmi_lookup_hi_url_addr) \
+ do_stub(compiler, pack_vec_base64) \
+ do_arch_entry(x86, compiler, pack_vec_base64, pack_vec_base64, \
+ base64_vbmi_pack_vec_addr) \
+ do_stub(compiler, join_0_1_base64) \
+ do_arch_entry(x86, compiler, join_0_1_base64, join_0_1_base64, \
+ base64_vbmi_join_0_1_addr) \
+ do_stub(compiler, join_1_2_base64) \
+ do_arch_entry(x86, compiler, join_1_2_base64, join_1_2_base64, \
+ base64_vbmi_join_1_2_addr) \
+ do_stub(compiler, join_2_3_base64) \
+ do_arch_entry(x86, compiler, join_2_3_base64, join_2_3_base64, \
+ base64_vbmi_join_2_3_addr) \
+ do_stub(compiler, encoding_table_base64) \
+ do_arch_entry(x86, compiler, encoding_table_base64, \
+ encoding_table_base64, base64_encoding_table_addr) \
+ do_stub(compiler, decoding_table_base64) \
+ do_arch_entry(x86, compiler, decoding_table_base64, \
+ decoding_table_base64, base64_decoding_table_addr) \
#define STUBGEN_FINAL_BLOBS_ARCH_DO(do_stub, \
do_arch_blob, \
do_arch_entry, \
do_arch_entry_init) \
- do_arch_blob(final, 11000 LP64_ONLY(+20000) \
- WINDOWS_ONLY(+22000) ZGC_ONLY(+20000)) \
+ do_arch_blob(final, 31000 \
+ WINDOWS_ONLY(+22000) ZGC_ONLY(+20000)) \
#endif // CPU_X86_STUBDECLARATIONS_HPP
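
The `do_stub`/`do_arch_entry` lists in this header are X-macros: a single declaration list that clients expand repeatedly with different definitions of the `do_*` parameters (field declarations, initializers, name tables, and so on). A self-contained miniature of the pattern, with invented stub names:

```cpp
#include <cstdio>

// One list of stubs, written once...
#define MY_ARCH_STUBS_DO(do_stub) \
  do_stub(f2i_fixup)              \
  do_stub(d2l_fixup)              \
  do_stub(float_sign_mask)

// ...expanded once to declare an entry pointer per stub...
#define DECLARE_ENTRY(name) static void* name##_entry = nullptr;
MY_ARCH_STUBS_DO(DECLARE_ENTRY)
#undef DECLARE_ENTRY

// ...and expanded again to generate code that walks the same list.
#define PRINT_STUB(name) std::printf("stub: %s at %p\n", #name, name##_entry);
int main() {
  MY_ARCH_STUBS_DO(PRINT_STUB)
  return 0;
}
#undef PRINT_STUB
```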
diff --git a/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp b/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp
index 7e63e6fb49b62..1a16416787d6e 100644
--- a/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp
+++ b/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp
@@ -339,7 +339,7 @@ address StubGenerator::generate_call_stub(address& return_address) {
__ jcc(Assembler::equal, L1);
__ stop("StubRoutines::call_stub: r15_thread is corrupted");
__ bind(L1);
- __ get_thread(rbx);
+ __ get_thread_slow(rbx);
__ cmpptr(r15_thread, thread);
__ jcc(Assembler::equal, L2);
__ stop("StubRoutines::call_stub: r15_thread is modified by call");
@@ -426,7 +426,7 @@ address StubGenerator::generate_catch_exception() {
__ jcc(Assembler::equal, L1);
__ stop("StubRoutines::catch_exception: r15_thread is corrupted");
__ bind(L1);
- __ get_thread(rbx);
+ __ get_thread_slow(rbx);
__ cmpptr(r15_thread, thread);
__ jcc(Assembler::equal, L2);
__ stop("StubRoutines::catch_exception: r15_thread is modified by call");
@@ -1313,7 +1313,7 @@ void StubGenerator::setup_arg_regs_using_thread(int nargs) {
__ mov(rax, r9); // r9 is also saved_r15
}
__ mov(saved_r15, r15); // r15 is callee saved and needs to be restored
- __ get_thread(r15_thread);
+ __ get_thread_slow(r15_thread);
assert(c_rarg0 == rcx && c_rarg1 == rdx && c_rarg2 == r8 && c_rarg3 == r9,
"unexpected argument registers");
__ movptr(Address(r15_thread, in_bytes(JavaThread::windows_saved_rdi_offset())), rdi);
@@ -1337,7 +1337,7 @@ void StubGenerator::restore_arg_regs_using_thread() {
assert(_regs_in_thread, "wrong call to restore_arg_regs");
const Register saved_r15 = r9;
#ifdef _WIN64
- __ get_thread(r15_thread);
+ __ get_thread_slow(r15_thread);
__ movptr(rsi, Address(r15_thread, in_bytes(JavaThread::windows_saved_rsi_offset())));
__ movptr(rdi, Address(r15_thread, in_bytes(JavaThread::windows_saved_rdi_offset())));
__ mov(r15, saved_r15); // r15 is callee saved and needs to be restored
@@ -3974,14 +3974,14 @@ address StubGenerator::generate_upcall_stub_load_target() {
StubCodeMark mark(this, stub_id);
address start = __ pc();
- __ resolve_global_jobject(j_rarg0, r15_thread, rscratch1);
+ __ resolve_global_jobject(j_rarg0, rscratch1);
// Load target method from receiver
__ load_heap_oop(rbx, Address(j_rarg0, java_lang_invoke_MethodHandle::form_offset()), rscratch1);
__ load_heap_oop(rbx, Address(rbx, java_lang_invoke_LambdaForm::vmentry_offset()), rscratch1);
__ load_heap_oop(rbx, Address(rbx, java_lang_invoke_MemberName::method_offset()), rscratch1);
__ access_load_at(T_ADDRESS, IN_HEAP, rbx,
Address(rbx, java_lang_invoke_ResolvedMethodName::vmtarget_offset()),
- noreg, noreg);
+ noreg);
__ movptr(Address(r15_thread, JavaThread::callee_target_offset()), rbx); // just in case callee is deoptimized
__ ret(0);
@@ -4204,6 +4204,8 @@ void StubGenerator::generate_compiler_stubs() {
generate_chacha_stubs();
+ generate_dilithium_stubs();
+
generate_sha3_stubs();
// data cache line writeback
@@ -4331,70 +4333,6 @@ void StubGenerator::generate_compiler_stubs() {
}
}
- // Get svml stub routine addresses
- void *libjsvml = nullptr;
- char ebuf[1024];
- char dll_name[JVM_MAXPATHLEN];
- if (os::dll_locate_lib(dll_name, sizeof(dll_name), Arguments::get_dll_dir(), "jsvml")) {
- libjsvml = os::dll_load(dll_name, ebuf, sizeof ebuf);
- }
- if (libjsvml != nullptr) {
- // SVML method naming convention
- // All the methods are named as __jsvml_op<T><N>_ha_<VV>
- // Where:
- // ha stands for high accuracy
- // <T> is optional to indicate float/double
- // Set to f for vector float operation
- // Omitted for vector double operation
- // <N> is the number of elements in the vector
- // 1, 2, 4, 8, 16
- // e.g. 128 bit float vector has 4 float elements
- // <VV> indicates the avx/sse level:
- // z0 is AVX512, l9 is AVX2, e9 is AVX1 and ex is for SSE2
- // e.g. __jsvml_expf16_ha_z0 is the method for computing 16 element vector float exp using AVX 512 insns
- // __jsvml_exp8_ha_z0 is the method for computing 8 element vector double exp using AVX 512 insns
-
- log_info(library)("Loaded library %s, handle " INTPTR_FORMAT, JNI_LIB_PREFIX "jsvml" JNI_LIB_SUFFIX, p2i(libjsvml));
- if (UseAVX > 2) {
- for (int op = 0; op < VectorSupport::NUM_VECTOR_OP_MATH; op++) {
- int vop = VectorSupport::VECTOR_OP_MATH_START + op;
- if ((!VM_Version::supports_avx512dq()) &&
- (vop == VectorSupport::VECTOR_OP_LOG || vop == VectorSupport::VECTOR_OP_LOG10 || vop == VectorSupport::VECTOR_OP_POW)) {
- continue;
- }
- snprintf(ebuf, sizeof(ebuf), "__jsvml_%sf16_ha_z0", VectorSupport::mathname[op]);
- StubRoutines::_vector_f_math[VectorSupport::VEC_SIZE_512][op] = (address)os::dll_lookup(libjsvml, ebuf);
-
- snprintf(ebuf, sizeof(ebuf), "__jsvml_%s8_ha_z0", VectorSupport::mathname[op]);
- StubRoutines::_vector_d_math[VectorSupport::VEC_SIZE_512][op] = (address)os::dll_lookup(libjsvml, ebuf);
- }
- }
- const char* avx_sse_str = (UseAVX >= 2) ? "l9" : ((UseAVX == 1) ? "e9" : "ex");
- for (int op = 0; op < VectorSupport::NUM_VECTOR_OP_MATH; op++) {
- int vop = VectorSupport::VECTOR_OP_MATH_START + op;
- if (vop == VectorSupport::VECTOR_OP_POW) {
- continue;
- }
- snprintf(ebuf, sizeof(ebuf), "__jsvml_%sf4_ha_%s", VectorSupport::mathname[op], avx_sse_str);
- StubRoutines::_vector_f_math[VectorSupport::VEC_SIZE_64][op] = (address)os::dll_lookup(libjsvml, ebuf);
-
- snprintf(ebuf, sizeof(ebuf), "__jsvml_%sf4_ha_%s", VectorSupport::mathname[op], avx_sse_str);
- StubRoutines::_vector_f_math[VectorSupport::VEC_SIZE_128][op] = (address)os::dll_lookup(libjsvml, ebuf);
-
- snprintf(ebuf, sizeof(ebuf), "__jsvml_%sf8_ha_%s", VectorSupport::mathname[op], avx_sse_str);
- StubRoutines::_vector_f_math[VectorSupport::VEC_SIZE_256][op] = (address)os::dll_lookup(libjsvml, ebuf);
-
- snprintf(ebuf, sizeof(ebuf), "__jsvml_%s1_ha_%s", VectorSupport::mathname[op], avx_sse_str);
- StubRoutines::_vector_d_math[VectorSupport::VEC_SIZE_64][op] = (address)os::dll_lookup(libjsvml, ebuf);
-
- snprintf(ebuf, sizeof(ebuf), "__jsvml_%s2_ha_%s", VectorSupport::mathname[op], avx_sse_str);
- StubRoutines::_vector_d_math[VectorSupport::VEC_SIZE_128][op] = (address)os::dll_lookup(libjsvml, ebuf);
-
- snprintf(ebuf, sizeof(ebuf), "__jsvml_%s4_ha_%s", VectorSupport::mathname[op], avx_sse_str);
- StubRoutines::_vector_d_math[VectorSupport::VEC_SIZE_256][op] = (address)os::dll_lookup(libjsvml, ebuf);
- }
- }
-
#endif // COMPILER2
#endif // COMPILER2_OR_JVMCI
}
diff --git a/src/hotspot/cpu/x86/stubGenerator_x86_64.hpp b/src/hotspot/cpu/x86/stubGenerator_x86_64.hpp
index 2263188216c41..c08b0168796e4 100644
--- a/src/hotspot/cpu/x86/stubGenerator_x86_64.hpp
+++ b/src/hotspot/cpu/x86/stubGenerator_x86_64.hpp
@@ -489,8 +489,9 @@ class StubGenerator: public StubCodeGenerator {
// SHA3 stubs
void generate_sha3_stubs();
- address generate_sha3_implCompress(StubGenStubId stub_id);
+ // Dilithium stubs and helper functions
+ void generate_dilithium_stubs();
// BASE64 stubs
address base64_shuffle_addr();
diff --git a/src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp b/src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp
index ccc8e456d5717..1e056f15213ee 100644
--- a/src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp
+++ b/src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp
@@ -2476,7 +2476,7 @@ address StubGenerator::generate_checkcast_copy(StubGenStubId stub_id, address *e
#ifdef ASSERT
Label L2;
- __ get_thread(r14);
+ __ get_thread_slow(r14);
__ cmpptr(r15_thread, r14);
__ jcc(Assembler::equal, L2);
__ stop("StubRoutines::call_stub: r15_thread is modified by call");
diff --git a/src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp b/src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp
new file mode 100644
index 0000000000000..7121db2ab9165
--- /dev/null
+++ b/src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp
@@ -0,0 +1,1034 @@
+/*
+ * Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This code is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 only, as
+ * published by the Free Software Foundation.
+ *
+ * This code is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+ * version 2 for more details (a copy is included in the LICENSE file that
+ * accompanied this code).
+ *
+ * You should have received a copy of the GNU General Public License version
+ * 2 along with this work; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
+ * or visit www.oracle.com if you need additional information or have any
+ * questions.
+ *
+ */
+
+#include "asm/assembler.hpp"
+#include "asm/assembler.inline.hpp"
+#include "runtime/stubRoutines.hpp"
+#include "macroAssembler_x86.hpp"
+#include "stubGenerator_x86_64.hpp"
+
+#define __ _masm->
+
+#define xmm(i) as_XMMRegister(i)
+
+#ifdef PRODUCT
+#define BLOCK_COMMENT(str) /* nothing */
+#else
+#define BLOCK_COMMENT(str) __ block_comment(str)
+#endif // PRODUCT
+
+#define BIND(label) bind(label); BLOCK_COMMENT(#label ":")
+
+#define XMMBYTES 64
+
+// Constants
+//
+ATTRIBUTE_ALIGNED(64) static const uint32_t dilithiumAvx512Consts[] = {
+ 58728449, // montQInvModR
+ 8380417, // dilithium_q
+ 2365951, // montRSquareModQ
+ 5373807 // Barrett addend for modular reduction
+};
+
+const int montQInvModRIdx = 0;
+const int dilithium_qIdx = 4;
+const int montRSquareModQIdx = 8;
+const int barrettAddendIdx = 12;
+
+static address dilithiumAvx512ConstsAddr(int offset) {
+ return ((address) dilithiumAvx512Consts) + offset;
+}
+
+const Register scratch = r10;
+const XMMRegister montMulPerm = xmm28;
+const XMMRegister montQInvModR = xmm30;
+const XMMRegister dilithium_q = xmm31;
+
+
+ATTRIBUTE_ALIGNED(64) static const uint32_t dilithiumAvx512Perms[] = {
+ // collect montmul results into the destination register
+ 17, 1, 19, 3, 21, 5, 23, 7, 25, 9, 27, 11, 29, 13, 31, 15,
+ // ntt
+ // level 4
+ 0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23,
+ 8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31,
+ // level 5
+ 0, 1, 2, 3, 16, 17, 18, 19, 8, 9, 10, 11, 24, 25, 26, 27,
+ 4, 5, 6, 7, 20, 21, 22, 23, 12, 13, 14, 15, 28, 29, 30, 31,
+ // level 6
+ 0, 1, 16, 17, 4, 5, 20, 21, 8, 9, 24, 25, 12, 13, 28, 29,
+ 2, 3, 18, 19, 6, 7, 22, 23, 10, 11, 26, 27, 14, 15, 30, 31,
+ // level 7
+ 0, 16, 2, 18, 4, 20, 6, 22, 8, 24, 10, 26, 12, 28, 14, 30,
+ 1, 17, 3, 19, 5, 21, 7, 23, 9, 25, 11, 27, 13, 29, 15, 31,
+ 0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6, 22, 7, 23,
+ 8, 24, 9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31,
+
+ // ntt inverse
+ // level 0
+ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30,
+ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31,
+ // level 1
+ 0, 16, 2, 18, 4, 20, 6, 22, 8, 24, 10, 26, 12, 28, 14, 30,
+ 1, 17, 3, 19, 5, 21, 7, 23, 9, 25, 11, 27, 13, 29, 15, 31,
+ // level 2
+ 0, 1, 16, 17, 4, 5, 20, 21, 8, 9, 24, 25, 12, 13, 28, 29,
+ 2, 3, 18, 19, 6, 7, 22, 23, 10, 11, 26, 27, 14, 15, 30, 31,
+ // level 3
+ 0, 1, 2, 3, 16, 17, 18, 19, 8, 9, 10, 11, 24, 25, 26, 27,
+ 4, 5, 6, 7, 20, 21, 22, 23, 12, 13, 14, 15, 28, 29, 30, 31,
+ // level 4
+ 0, 1, 2, 3, 4, 5, 6, 7, 16, 17, 18, 19, 20, 21, 22, 23,
+ 8, 9, 10, 11, 12, 13, 14, 15, 24, 25, 26, 27, 28, 29, 30, 31
+};
+
+const int montMulPermsIdx = 0;
+const int nttL4PermsIdx = 64;
+const int nttL5PermsIdx = 192;
+const int nttL6PermsIdx = 320;
+const int nttL7PermsIdx = 448;
+const int nttInvL0PermsIdx = 704;
+const int nttInvL1PermsIdx = 832;
+const int nttInvL2PermsIdx = 960;
+const int nttInvL3PermsIdx = 1088;
+const int nttInvL4PermsIdx = 1216;
+
+static address dilithiumAvx512PermsAddr() {
+ return (address) dilithiumAvx512Perms;
+}
+
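
The index tables above drive `evpermt2d`, a permute across two 16-dword tables. A scalar model of its selection rule (my reading of the instruction, for illustration only): indices 0..15 pick a lane from the destination register, indices 16..31 pick lane i-16 from the second source. With the first row (17, 1, 19, 3, ...) this interleaves the odd lanes of two partial results into one full vector, which is how the Montgomery products are collected below.

```cpp
#include <cstdint>
#include <cstring>

// Scalar model of a two-table dword permute as used with montMulPerm.
// Illustrative only; this helper is not part of the stub.
static void permt2d_model(int32_t dst[16], const int32_t src2[16], const int idx[16]) {
  int32_t out[16];
  for (int i = 0; i < 16; i++) {
    out[i] = (idx[i] < 16) ? dst[idx[i]] : src2[idx[i] - 16];
  }
  std::memcpy(dst, out, sizeof(out));
}
```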
+// We do Montgomery multiplications of two vectors of 16 ints each in 4 steps:
+// 1. Do the multiplications of the corresponding even numbered slots into
+// the odd numbered slots of a third register.
+// 2. Swap the even and odd numbered slots of the original input registers.
+// 3. Similar to step 1, but into a different output register.
+// 4. Combine the outputs of step 1 and step 3 into the output of the Montgomery
+// multiplication.
+// (For levels 0-6 of the Ntt and levels 1-7 of the inverse Ntt we only swap the
+// odd and even slots of the first multiplicand, because in the second (the zetas)
+// each odd slot already contains the same number as the corresponding even slot.)
+// The indexes of the registers to be multiplied
+// are in inputRegs1[] and inputRegs2[].
+// The results go to the registers whose indexes are in outputRegs.
+// scratchRegs should contain 12 different register indexes.
+// The set in outputRegs should not overlap with the set of the middle four
+// scratch registers.
+// The sets in inputRegs1 and inputRegs2 cannot overlap with the set of the
+// first eight scratch registers.
+// In most of the cases, the odd and the corresponding even slices of the
+// registers indexed by the numbers in inputRegs2 will contain the same number,
+// this should be indicated by calling this function with
+// input2NeedsShuffle=false .
+//
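+// A scalar sketch of what each 32-bit lane computes (signed Montgomery
+// reduction with R = 2^32, q = 8380417; a rough outline of the instruction
+// sequence below, not the Java source):
+//   int64_t t = (int64_t) a * b;                           // vpmuldq
+//   int32_t m = (int32_t) t * montQInvModR;                // vpmulld (low 32 bits)
+//   int32_t res = (int32_t) ((t - (int64_t) m * q) >> 32); // vpmuldq + evpsubd (high half)
+// The final evpermt2d with montMulPerm gathers the 16 high halves produced by
+// the two passes (even and odd source slots) into the destination register.
+//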
+static void montMul64(int outputRegs[], int inputRegs1[], int inputRegs2[],
+ int scratchRegs[], bool input2NeedsShuffle,
+ MacroAssembler *_masm) {
+
+ for (int i = 0; i < 4; i++) {
+ __ vpmuldq(xmm(scratchRegs[i]), xmm(inputRegs1[i]), xmm(inputRegs2[i]),
+ Assembler::AVX_512bit);
+ }
+ for (int i = 0; i < 4; i++) {
+ __ vpmulld(xmm(scratchRegs[i + 4]), xmm(scratchRegs[i]), montQInvModR,
+ Assembler::AVX_512bit);
+ }
+ for (int i = 0; i < 4; i++) {
+ __ vpmuldq(xmm(scratchRegs[i + 4]), xmm(scratchRegs[i + 4]), dilithium_q,
+ Assembler::AVX_512bit);
+ }
+ for (int i = 0; i < 4; i++) {
+ __ evpsubd(xmm(scratchRegs[i + 4]), k0, xmm(scratchRegs[i]),
+ xmm(scratchRegs[i + 4]), false, Assembler::AVX_512bit);
+ }
+
+ for (int i = 0; i < 4; i++) {
+ __ vpshufd(xmm(inputRegs1[i]), xmm(inputRegs1[i]), 0xB1,
+ Assembler::AVX_512bit);
+ if (input2NeedsShuffle) {
+ __ vpshufd(xmm(inputRegs2[i]), xmm(inputRegs2[i]), 0xB1,
+ Assembler::AVX_512bit);
+ }
+ }
+
+ for (int i = 0; i < 4; i++) {
+ __ vpmuldq(xmm(scratchRegs[i]), xmm(inputRegs1[i]), xmm(inputRegs2[i]),
+ Assembler::AVX_512bit);
+ }
+ for (int i = 0; i < 4; i++) {
+ __ vpmulld(xmm(scratchRegs[i + 8]), xmm(scratchRegs[i]), montQInvModR,
+ Assembler::AVX_512bit);
+ }
+ for (int i = 0; i < 4; i++) {
+ __ vpmuldq(xmm(scratchRegs[i + 8]), xmm(scratchRegs[i + 8]), dilithium_q,
+ Assembler::AVX_512bit);
+ }
+ for (int i = 0; i < 4; i++) {
+ __ evpsubd(xmm(outputRegs[i]), k0, xmm(scratchRegs[i]),
+ xmm(scratchRegs[i + 8]), false, Assembler::AVX_512bit);
+ }
+
+ for (int i = 0; i < 4; i++) {
+ __ evpermt2d(xmm(outputRegs[i]), montMulPerm, xmm(scratchRegs[i + 4]),
+ Assembler::AVX_512bit);
+ }
+}
+
+static void montMul64(int outputRegs[], int inputRegs1[], int inputRegs2[],
+ int scratchRegs[], MacroAssembler *_masm) {
+ montMul64(outputRegs, inputRegs1, inputRegs2, scratchRegs, false, _masm);
+}
+
+static void sub_add(int subResult[], int addResult[],
+ int input1[], int input2[], MacroAssembler *_masm) {
+
+ for (int i = 0; i < 4; i++) {
+ __ evpsubd(xmm(subResult[i]), k0, xmm(input1[i]), xmm(input2[i]), false,
+ Assembler::AVX_512bit);
+ }
+
+ for (int i = 0; i < 4; i++) {
+ __ evpaddd(xmm(addResult[i]), k0, xmm(input1[i]), xmm(input2[i]), false,
+ Assembler::AVX_512bit);
+ }
+}
+
+static void loadPerm(int destinationRegs[], Register perms,
+ int offset, MacroAssembler *_masm) {
+ __ evmovdqul(xmm(destinationRegs[0]), Address(perms, offset),
+ Assembler::AVX_512bit);
+ for (int i = 1; i < 4; i++) {
+ __ evmovdqul(xmm(destinationRegs[i]), xmm(destinationRegs[0]),
+ Assembler::AVX_512bit);
+ }
+}
+
+static void load4Xmms(int destinationRegs[], Register source, int offset,
+ MacroAssembler *_masm) {
+ for (int i = 0; i < 4; i++) {
+ __ evmovdqul(xmm(destinationRegs[i]), Address(source, offset + i * XMMBYTES),
+ Assembler::AVX_512bit);
+ }
+}
+
+static void loadXmm29(Register source, int offset, MacroAssembler *_masm) {
+ __ evmovdqul(xmm29, Address(source, offset), Assembler::AVX_512bit);
+}
+
+static void store4Xmms(Register destination, int offset, int xmmRegs[],
+ MacroAssembler *_masm) {
+ for (int i = 0; i < 4; i++) {
+ __ evmovdqul(Address(destination, offset + i * XMMBYTES), xmm(xmmRegs[i]),
+ Assembler::AVX_512bit);
+ }
+}
+
+static int xmm0_3[] = {0, 1, 2, 3};
+static int xmm0145[] = {0, 1, 4, 5};
+static int xmm0246[] = {0, 2, 4, 6};
+static int xmm0426[] = {0, 4, 2, 6};
+static int xmm1357[] = {1, 3, 5, 7};
+static int xmm1537[] = {1, 5, 3, 7};
+static int xmm2367[] = {2, 3, 6, 7};
+static int xmm4_7[] = {4, 5, 6, 7};
+static int xmm8_11[] = {8, 9, 10, 11};
+static int xmm12_15[] = {12, 13, 14, 15};
+static int xmm16_19[] = {16, 17, 18, 19};
+static int xmm20_23[] = {20, 21, 22, 23};
+static int xmm20222426[] = {20, 22, 24, 26};
+static int xmm21232527[] = {21, 23, 25, 27};
+static int xmm24_27[] = {24, 25, 26, 27};
+static int xmm4_20_24[] = {4, 5, 6, 7, 20, 21, 22, 23, 24, 25, 26, 27};
+static int xmm16_27[] = {16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27};
+static int xmm29_29[] = {29, 29, 29, 29};
+
+// Dilithium NTT function except for the final "normalization" to |coeff| < Q.
+// Implements
+// static int implDilithiumAlmostNtt(int[] coeffs, int[] zetas) {}
+//
+// coeffs (int[256]) = c_rarg0
+// zetas (int[256]) = c_rarg1
+//
+//
+static address generate_dilithiumAlmostNtt_avx512(StubGenerator *stubgen,
+ MacroAssembler *_masm) {
+
+ __ align(CodeEntryAlignment);
+ StubGenStubId stub_id = dilithiumAlmostNtt_id;
+ StubCodeMark mark(stubgen, stub_id);
+ address start = __ pc();
+ __ enter();
+
+ Label L_loop, L_end;
+
+ const Register coeffs = c_rarg0;
+ const Register zetas = c_rarg1;
+ const Register iterations = c_rarg2;
+
+ const Register perms = r11;
+
+ __ lea(perms, ExternalAddress(dilithiumAvx512PermsAddr()));
+
+ __ evmovdqul(montMulPerm, Address(perms, montMulPermsIdx), Assembler::AVX_512bit);
+
+ // Each level represents one iteration of the outer for loop of the Java version.
+ // In each of these iterations half of the coefficients are (Montgomery)
+ // multiplied by the zeta corresponding to the coefficient and then these
+ // products are added to and subtracted from the other half of the
+ // coefficients. In each level we collect the coefficients that need to be
+ // multiplied by the zetas into one set of vector registers and the rest into
+ // another (using evpermi2d() instructions where necessary, i.e. in levels 4-7),
+ // then redistribute the addition/subtraction results.
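+ //
+ // A Java-like sketch of one butterfly at a given level (an illustration only,
+ // not the exact Java source):
+ //   int t = montMul(zetas[k], coeffs[j + offset]);
+ //   coeffs[j + offset] = coeffs[j] - t;
+ //   coeffs[j]          = coeffs[j] + t;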
+
+ // For levels 0 and 1 the zetas are the same within each group of 4 xmm
+ // registers that we would otherwise use for them, so we use only one, xmm29.
+ loadXmm29(zetas, 0, _masm);
+ __ vpbroadcastd(montQInvModR,
+ ExternalAddress(dilithiumAvx512ConstsAddr(montQInvModRIdx)),
+ Assembler::AVX_512bit, scratch); // q^-1 mod 2^32
+ __ vpbroadcastd(dilithium_q,
+ ExternalAddress(dilithiumAvx512ConstsAddr(dilithium_qIdx)),
+ Assembler::AVX_512bit, scratch); // q
+
+ // load all coefficients into the vector registers Zmm_0-Zmm_15,
+ // 16 coefficients into each
+ load4Xmms(xmm0_3, coeffs, 0, _masm);
+ load4Xmms(xmm4_7, coeffs, 4 * XMMBYTES, _masm);
+ load4Xmms(xmm8_11, coeffs, 8 * XMMBYTES, _masm);
+ load4Xmms(xmm12_15, coeffs, 12 * XMMBYTES, _masm);
+
+ // Levels 0 and 1 can be done entirely in registers, as the zetas on these
+ // levels are the same for all the montmuls that we do in parallel.
+
+ // level 0
+ montMul64(xmm16_19, xmm8_11, xmm29_29, xmm16_27, _masm);
+ sub_add(xmm8_11, xmm0_3, xmm0_3, xmm16_19, _masm);
+ montMul64(xmm16_19, xmm12_15, xmm29_29, xmm16_27, _masm);
+ loadXmm29(zetas, 512, _masm); // for level 1
+ sub_add(xmm12_15, xmm4_7, xmm4_7, xmm16_19, _masm);
+
+ // level 1
+
+ montMul64(xmm16_19, xmm4_7, xmm29_29, xmm16_27, _masm);
+ loadXmm29(zetas, 768, _masm);
+ sub_add(xmm4_7, xmm0_3, xmm0_3, xmm16_19, _masm);
+ montMul64(xmm16_19, xmm12_15, xmm29_29, xmm16_27, _masm);
+ sub_add(xmm12_15, xmm8_11, xmm8_11, xmm16_19, _masm);
+
+ // Levels 2 to 7 are done in 2 batches: we first save half of the coefficients
+ // from level 1 into memory, do all the level 2 to level 7 computations
+ // on the remaining half in the vector registers, save the result to
+ // memory after level 7, then load back the coefficients that we saved after
+ // level 1 and do the same computations with those.
+
+ store4Xmms(coeffs, 8 * XMMBYTES, xmm8_11, _masm);
+ store4Xmms(coeffs, 12 * XMMBYTES, xmm12_15, _masm);
+
+ __ movl(iterations, 2);
+
+ __ align(OptoLoopAlignment);
+ __ BIND(L_loop);
+
+ __ subl(iterations, 1);
+
+ // level 2
+ load4Xmms(xmm12_15, zetas, 2 * 512, _masm);
+ montMul64(xmm16_19, xmm2367, xmm12_15, xmm16_27, _masm);
+ load4Xmms(xmm12_15, zetas, 3 * 512, _masm); // for level 3
+ sub_add(xmm2367, xmm0145, xmm0145, xmm16_19, _masm);
+
+ // level 3
+
+ montMul64(xmm16_19, xmm1357, xmm12_15, xmm16_27, _masm);
+ sub_add(xmm1357, xmm0246, xmm0246, xmm16_19, _masm);
+
+ // level 4
+ loadPerm(xmm16_19, perms, nttL4PermsIdx, _masm);
+ loadPerm(xmm12_15, perms, nttL4PermsIdx + 64, _masm);
+ load4Xmms(xmm24_27, zetas, 4 * 512, _masm);
+
+ for (int i = 0; i < 8; i += 2) {
+ __ evpermi2d(xmm(i/2 + 16), xmm(i), xmm(i + 1), Assembler::AVX_512bit);
+ }
+ for (int i = 0; i < 8; i += 2) {
+ __ evpermi2d(xmm(i / 2 + 12), xmm(i), xmm(i + 1), Assembler::AVX_512bit);
+ }
+
+ montMul64(xmm12_15, xmm12_15, xmm24_27, xmm4_20_24, _masm);
+ sub_add(xmm1357, xmm0246, xmm16_19, xmm12_15, _masm);
+
+ // level 5
+ loadPerm(xmm16_19, perms, nttL5PermsIdx, _masm);
+ loadPerm(xmm12_15, perms, nttL5PermsIdx + 64, _masm);
+ load4Xmms(xmm24_27, zetas, 5 * 512, _masm);
+
+ for (int i = 0; i < 8; i += 2) {
+ __ evpermi2d(xmm(i/2 + 16), xmm(i), xmm(i + 1), Assembler::AVX_512bit);
+ }
+ for (int i = 0; i < 8; i += 2) {
+ __ evpermi2d(xmm(i / 2 + 12), xmm(i), xmm(i + 1), Assembler::AVX_512bit);
+ }
+
+ montMul64(xmm12_15, xmm12_15, xmm24_27, xmm4_20_24, _masm);
+ sub_add(xmm1357, xmm0246, xmm16_19, xmm12_15, _masm);
+
+ // level 6
+ loadPerm(xmm16_19, perms, nttL6PermsIdx, _masm);
+ loadPerm(xmm12_15, perms, nttL6PermsIdx + 64, _masm);
+ load4Xmms(xmm24_27, zetas, 6 * 512, _masm);
+
+ for (int i = 0; i < 8; i += 2) {
+ __ evpermi2d(xmm(i/2 + 16), xmm(i), xmm(i + 1), Assembler::AVX_512bit);
+ }
+ for (int i = 0; i < 8; i += 2) {
+ __ evpermi2d(xmm(i / 2 + 12), xmm(i), xmm(i + 1), Assembler::AVX_512bit);
+ }
+
+ montMul64(xmm12_15, xmm12_15, xmm24_27, xmm4_20_24, _masm);
+ sub_add(xmm1357, xmm0246, xmm16_19, xmm12_15, _masm);
+
+ // level 7
+ loadPerm(xmm16_19, perms, nttL7PermsIdx, _masm);
+ loadPerm(xmm12_15, perms, nttL7PermsIdx + 64, _masm);
+ load4Xmms(xmm24_27, zetas, 7 * 512, _masm);
+
+ for (int i = 0; i < 8; i += 2) {
+ __ evpermi2d(xmm(i / 2 + 16), xmm(i), xmm(i + 1), Assembler::AVX_512bit);
+ }
+ for (int i = 0; i < 8; i += 2) {
+ __ evpermi2d(xmm(i / 2 + 12), xmm(i), xmm(i + 1), Assembler::AVX_512bit);
+ }
+
+ montMul64(xmm12_15, xmm12_15, xmm24_27, xmm4_20_24, true, _masm);
+ loadPerm(xmm0246, perms, nttL7PermsIdx + 2 * XMMBYTES, _masm);
+ loadPerm(xmm1357, perms, nttL7PermsIdx + 3 * XMMBYTES, _masm);
+ sub_add(xmm21232527, xmm20222426, xmm16_19, xmm12_15, _masm);
+
+ for (int i = 0; i < 8; i += 2) {
+ __ evpermi2d(xmm(i), xmm(i + 20), xmm(i + 21), Assembler::AVX_512bit);
+ __ evpermi2d(xmm(i + 1), xmm(i + 20), xmm(i + 21), Assembler::AVX_512bit);
+ }
+
+ __ cmpl(iterations, 0);
+ __ jcc(Assembler::equal, L_end);
+
+ store4Xmms(coeffs, 0, xmm0_3, _masm);
+ store4Xmms(coeffs, 4 * XMMBYTES, xmm4_7, _masm);
+
+ load4Xmms(xmm0_3, coeffs, 8 * XMMBYTES, _masm);
+ load4Xmms(xmm4_7, coeffs, 12 * XMMBYTES, _masm);
+
+ __ addptr(zetas, 4 * XMMBYTES);
+
+ __ jmp(L_loop);
+
+ __ BIND(L_end);
+
+ store4Xmms(coeffs, 8 * XMMBYTES, xmm0_3, _masm);
+ store4Xmms(coeffs, 12 * XMMBYTES, xmm4_7, _masm);
+
+ __ leave(); // required for proper stackwalking of RuntimeStub frame
+ __ mov64(rax, 0); // return 0
+ __ ret(0);
+
+ return start;
+}
+
+// Dilithium Inverse NTT function except for the final mod Q division by 2^256.
+// Implements
+// static int implDilithiumAlmostInverseNtt(int[] coeffs, int[] zetas) {}
+//
+// coeffs (int[256]) = c_rarg0
+// zetas (int[256]) = c_rarg1
+static address generate_dilithiumAlmostInverseNtt_avx512(StubGenerator *stubgen,
+ MacroAssembler *_masm) {
+
+ __ align(CodeEntryAlignment);
+ StubGenStubId stub_id = dilithiumAlmostInverseNtt_id;
+ StubCodeMark mark(stubgen, stub_id);
+ address start = __ pc();
+ __ enter();
+
+ Label L_loop, L_end;
+
+ const Register coeffs = c_rarg0;
+ const Register zetas = c_rarg1;
+
+ const Register iterations = c_rarg2;
+
+ const Register perms = r11;
+
+ __ lea(perms, ExternalAddress(dilithiumAvx512PermsAddr()));
+
+ __ evmovdqul(montMulPerm, Address(perms, montMulPermsIdx), Assembler::AVX_512bit);
+ __ vpbroadcastd(montQInvModR,
+ ExternalAddress(dilithiumAvx512ConstsAddr(montQInvModRIdx)),
+ Assembler::AVX_512bit, scratch); // q^-1 mod 2^32
+ __ vpbroadcastd(dilithium_q,
+ ExternalAddress(dilithiumAvx512ConstsAddr(dilithium_qIdx)),
+ Assembler::AVX_512bit, scratch); // q
+
+ // Each level represents one iteration of the outer for loop of the
+ // Java version.
+ // In each of these iterations half of the coefficients are added to and
+ // subtracted from the other half of the coefficients, then the result of
+ // the subtraction is (Montgomery) multiplied by the corresponding zetas.
+ // In each level we collect the coefficients (using evpermi2d()
+ // instructions where necessary, i.e. on levels 0-4) so that the results of
+ // the additions and subtractions land in vector registers that align with
+ // each other and with the zetas.
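+ //
+ // A Java-like sketch of one inverse butterfly at a given level (an
+ // illustration only, not the exact Java source):
+ //   int t = coeffs[j];
+ //   coeffs[j]          = t + coeffs[j + offset];
+ //   coeffs[j + offset] = montMul(zetas[k], t - coeffs[j + offset]);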
+
+ // We do levels 0-6 in two batches, each batch entirely in the vector registers
+ load4Xmms(xmm0_3, coeffs, 0, _masm);
+ load4Xmms(xmm4_7, coeffs, 4 * XMMBYTES, _masm);
+
+ __ movl(iterations, 2);
+
+ __ align(OptoLoopAlignment);
+ __ BIND(L_loop);
+
+ __ subl(iterations, 1);
+
+ // level 0
+ loadPerm(xmm8_11, perms, nttInvL0PermsIdx, _masm);
+ loadPerm(xmm12_15, perms, nttInvL0PermsIdx + 64, _masm);
+
+ for (int i = 0; i < 8; i += 2) {
+ __ evpermi2d(xmm(i / 2 + 8), xmm(i), xmm(i + 1), Assembler::AVX_512bit);
+ __ evpermi2d(xmm(i / 2 + 12), xmm(i), xmm(i + 1), Assembler::AVX_512bit);
+ }
+
+ load4Xmms(xmm4_7, zetas, 0, _masm);
+ sub_add(xmm24_27, xmm0_3, xmm8_11, xmm12_15, _masm);
+ montMul64(xmm4_7, xmm4_7, xmm24_27, xmm16_27, true, _masm);
+
+ // level 1
+ loadPerm(xmm8_11, perms, nttInvL1PermsIdx, _masm);
+ loadPerm(xmm12_15, perms, nttInvL1PermsIdx + 64, _masm);
+
+ for (int i = 0; i < 4; i++) {
+ __ evpermi2d(xmm(i + 8), xmm(i), xmm(i + 4), Assembler::AVX_512bit);
+ __ evpermi2d(xmm(i + 12), xmm(i), xmm(i + 4), Assembler::AVX_512bit);
+ }
+
+ load4Xmms(xmm4_7, zetas, 512, _masm);
+ sub_add(xmm24_27, xmm0_3, xmm8_11, xmm12_15, _masm);
+ montMul64(xmm4_7, xmm24_27, xmm4_7, xmm16_27, _masm);
+
+ // level 2
+ loadPerm(xmm8_11, perms, nttInvL2PermsIdx, _masm);
+ loadPerm(xmm12_15, perms, nttInvL2PermsIdx + 64, _masm);
+
+ for (int i = 0; i < 4; i++) {
+ __ evpermi2d(xmm(i + 8), xmm(i), xmm(i + 4), Assembler::AVX_512bit);
+ __ evpermi2d(xmm(i + 12), xmm(i), xmm(i + 4), Assembler::AVX_512bit);
+ }
+
+ load4Xmms(xmm4_7, zetas, 2 * 512, _masm);
+ sub_add(xmm24_27, xmm0_3, xmm8_11, xmm12_15, _masm);
+ montMul64(xmm4_7, xmm24_27, xmm4_7, xmm16_27, _masm);
+
+ // level 3
+ loadPerm(xmm8_11, perms, nttInvL3PermsIdx, _masm);
+ loadPerm(xmm12_15, perms, nttInvL3PermsIdx + 64, _masm);
+
+ for (int i = 0; i < 4; i++) {
+ __ evpermi2d(xmm(i + 8), xmm(i), xmm(i + 4), Assembler::AVX_512bit);
+ __ evpermi2d(xmm(i + 12), xmm(i), xmm(i + 4), Assembler::AVX_512bit);
+ }
+
+ load4Xmms(xmm4_7, zetas, 3 * 512, _masm);
+ sub_add(xmm24_27, xmm0_3, xmm8_11, xmm12_15, _masm);
+ montMul64(xmm4_7, xmm24_27, xmm4_7, xmm16_27, _masm);
+
+ // level 4
+ loadPerm(xmm8_11, perms, nttInvL4PermsIdx, _masm);
+ loadPerm(xmm12_15, perms, nttInvL4PermsIdx + 64, _masm);
+
+ for (int i = 0; i < 4; i++) {
+ __ evpermi2d(xmm(i + 8), xmm(i), xmm(i + 4), Assembler::AVX_512bit);
+ __ evpermi2d(xmm(i + 12), xmm(i), xmm(i + 4), Assembler::AVX_512bit);
+ }
+
+ load4Xmms(xmm4_7, zetas, 4 * 512, _masm);
+ sub_add(xmm24_27, xmm0_3, xmm8_11, xmm12_15, _masm);
+ montMul64(xmm4_7, xmm24_27, xmm4_7, xmm16_27, _masm);
+
+ // level 5
+ load4Xmms(xmm12_15, zetas, 5 * 512, _masm);
+ sub_add(xmm8_11, xmm0_3, xmm0426, xmm1537, _masm);
+ montMul64(xmm4_7, xmm8_11, xmm12_15, xmm16_27, _masm);
+
+ // level 6
+ load4Xmms(xmm12_15, zetas, 6 * 512, _masm);
+ sub_add(xmm8_11, xmm0_3, xmm0145, xmm2367, _masm);
+ montMul64(xmm4_7, xmm8_11, xmm12_15, xmm16_27, _masm);
+
+ __ cmpl(iterations, 0);
+ __ jcc(Assembler::equal, L_end);
+
+ // save the coefficients of the first batch, adjust the zetas
+ // and load the second batch of coefficients
+ store4Xmms(coeffs, 0, xmm0_3, _masm);
+ store4Xmms(coeffs, 4 * XMMBYTES, xmm4_7, _masm);
+
+ __ addptr(zetas, 4 * XMMBYTES);
+
+ load4Xmms(xmm0_3, coeffs, 8 * XMMBYTES, _masm);
+ load4Xmms(xmm4_7, coeffs, 12 * XMMBYTES, _masm);
+
+ __ jmp(L_loop);
+
+ __ BIND(L_end);
+
+ // load the first batch of coefficients that was saved after level 6
+ // into Zmm_8-Zmm_15 and do the last level entirely in the vector
+ // registers
+ load4Xmms(xmm8_11, coeffs, 0, _masm);
+ load4Xmms(xmm12_15, coeffs, 4 * XMMBYTES, _masm);
+
+ // level 7
+
+ loadXmm29(zetas, 7 * 512, _masm);
+
+ for (int i = 0; i < 8; i++) {
+ __ evpaddd(xmm(i + 16), k0, xmm(i), xmm(i + 8), false, Assembler::AVX_512bit);
+ }
+
+ for (int i = 0; i < 8; i++) {
+ __ evpsubd(xmm(i), k0, xmm(i + 8), xmm(i), false, Assembler::AVX_512bit);
+ }
+
+ store4Xmms(coeffs, 0, xmm16_19, _masm);
+ store4Xmms(coeffs, 4 * XMMBYTES, xmm20_23, _masm);
+ montMul64(xmm0_3, xmm0_3, xmm29_29, xmm16_27, _masm);
+ montMul64(xmm4_7, xmm4_7, xmm29_29, xmm16_27, _masm);
+ store4Xmms(coeffs, 8 * XMMBYTES, xmm0_3, _masm);
+ store4Xmms(coeffs, 12 * XMMBYTES, xmm4_7, _masm);
+
+ __ leave(); // required for proper stackwalking of RuntimeStub frame
+ __ mov64(rax, 0); // return 0
+ __ ret(0);
+
+ return start;
+}
+
+// Dilithium multiply polynomials in the NTT domain.
+// Implements
+// static int implDilithiumNttMult(
+//   int[] result, int[] ntta, int[] nttb) {}
+//
+// result (int[256]) = c_rarg0
+// poly1 (int[256]) = c_rarg1
+// poly2 (int[256]) = c_rarg2
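+//
+// A sketch of the arithmetic per coefficient (an illustration, assuming
+// montMul(a, b) computes a * b * 2^-32 mod q):
+//   result[i] = montMul(poly1[i], montMul(poly2[i], montRSquareModQ))
+// Since montRSquareModQ == 2^64 mod q, the two 2^-32 factors cancel and
+// result[i] ends up congruent to poly1[i] * poly2[i] mod q.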
+static address generate_dilithiumNttMult_avx512(StubGenerator *stubgen,
+ MacroAssembler *_masm) {
+
+ __ align(CodeEntryAlignment);
+ StubGenStubId stub_id = dilithiumNttMult_id;
+ StubCodeMark mark(stubgen, stub_id);
+ address start = __ pc();
+ __ enter();
+
+ Label L_loop;
+
+ const Register result = c_rarg0;
+ const Register poly1 = c_rarg1;
+ const Register poly2 = c_rarg2;
+
+ const Register perms = r10; // reuses scratch once it is no longer needed
+ const Register len = r11;
+
+ const XMMRegister montRSquareModQ = xmm29;
+
+ __ vpbroadcastd(montQInvModR,
+ ExternalAddress(dilithiumAvx512ConstsAddr(montQInvModRIdx)),
+ Assembler::AVX_512bit, scratch); // q^-1 mod 2^32
+ __ vpbroadcastd(dilithium_q,
+ ExternalAddress(dilithiumAvx512ConstsAddr(dilithium_qIdx)),
+ Assembler::AVX_512bit, scratch); // q
+ __ vpbroadcastd(montRSquareModQ,
+ ExternalAddress(dilithiumAvx512ConstsAddr(montRSquareModQIdx)),
+ Assembler::AVX_512bit, scratch); // 2^64 mod q
+
+ __ lea(perms, ExternalAddress(dilithiumAvx512PermsAddr()));
+ __ evmovdqul(montMulPerm, Address(perms, montMulPermsIdx), Assembler::AVX_512bit);
+
+ __ movl(len, 4);
+
+ __ align(OptoLoopAlignment);
+ __ BIND(L_loop);
+
+ load4Xmms(xmm4_7, poly2, 0, _masm);
+ load4Xmms(xmm0_3, poly1, 0, _masm);
+ montMul64(xmm4_7, xmm4_7, xmm29_29, xmm16_27, _masm);
+ montMul64(xmm0_3, xmm0_3, xmm4_7, xmm16_27, true, _masm);
+ store4Xmms(result, 0, xmm0_3, _masm);
+
+ __ subl(len, 1);
+ __ addptr(poly1, 4 * XMMBYTES);
+ __ addptr(poly2, 4 * XMMBYTES);
+ __ addptr(result, 4 * XMMBYTES);
+ __ cmpl(len, 0);
+ __ jcc(Assembler::notEqual, L_loop);
+
+ __ leave(); // required for proper stackwalking of RuntimeStub frame
+ __ mov64(rax, 0); // return 0
+ __ ret(0);
+
+ return start;
+}
+
+// Dilithium Montgomery multiply an array by a constant.
+// Implements
+// static int implDilithiumMontMulByConstant(int[] coeffs, int constant) {}
+//
+// coeffs (int[256]) = c_rarg0
+// constant (int) = c_rarg1
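+//
+// A sketch of the arithmetic per coefficient (an illustration, assuming
+// montMul(a, b) computes a * b * 2^-32 mod q):
+//   coeffs[i] = montMul(coeffs[i], constant), i.e. coeffs[i] * constant * 2^-32 mod q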
+static address generate_dilithiumMontMulByConstant_avx512(StubGenerator *stubgen,
+ MacroAssembler *_masm) {
+
+ __ align(CodeEntryAlignment);
+ StubGenStubId stub_id = dilithiumMontMulByConstant_id;
+ StubCodeMark mark(stubgen, stub_id);
+ address start = __ pc();
+ __ enter();
+
+ Label L_loop;
+
+ const Register coeffs = c_rarg0;
+ const Register rConstant = c_rarg1;
+
+ const Register perms = c_rarg2; // c_rarg2 is not used for an argument here
+ const Register len = r11;
+
+ const XMMRegister constant = xmm29;
+
+ __ lea(perms, ExternalAddress(dilithiumAvx512PermsAddr()));
+
+ // the following four vector registers are used in montMul64
+ __ vpbroadcastd(montQInvModR,
+ ExternalAddress(dilithiumAvx512ConstsAddr(montQInvModRIdx)),
+ Assembler::AVX_512bit, scratch); // q^-1 mod 2^32
+ __ vpbroadcastd(dilithium_q,
+ ExternalAddress(dilithiumAvx512ConstsAddr(dilithium_qIdx)),
+ Assembler::AVX_512bit, scratch); // q
+ __ evmovdqul(montMulPerm, Address(perms, montMulPermsIdx), Assembler::AVX_512bit);
+ __ evpbroadcastd(constant, rConstant, Assembler::AVX_512bit); // constant multiplier
+
+ __ movl(len, 2);
+
+ __ align(OptoLoopAlignment);
+ __ BIND(L_loop);
+
+ load4Xmms(xmm0_3, coeffs, 0, _masm);
+ load4Xmms(xmm4_7, coeffs, 4 * XMMBYTES, _masm);
+ montMul64(xmm0_3, xmm0_3, xmm29_29, xmm16_27, _masm);
+ montMul64(xmm4_7, xmm4_7, xmm29_29, xmm16_27, _masm);
+ store4Xmms(coeffs, 0, xmm0_3, _masm);
+ store4Xmms(coeffs, 4 * XMMBYTES, xmm4_7, _masm);
+
+ __ subl(len, 1);
+ __ addptr(coeffs, 512);
+ __ cmpl(len, 0);
+ __ jcc(Assembler::notEqual, L_loop);
+
+ __ leave(); // required for proper stackwalking of RuntimeStub frame
+ __ mov64(rax, 0); // return 0
+ __ ret(0);
+
+ return start;
+}
+
+// Dilithium decompose poly.
+// Implements
+// static int implDilithiumDecomposePoly(int[] input, int[] lowPart,
+//   int[] highPart, int twoGamma2, int multiplier) {}
+//
+// input (int[256]) = c_rarg0
+// lowPart (int[256]) = c_rarg1
+// highPart (int[256]) = c_rarg2
+// twoGamma2 (int) = c_rarg3
+// multiplier (int) = c_rarg4
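+//
+// Conceptually this is the Decompose routine of FIPS 204: each coefficient r
+// is split into a high part r1 and a low part r0 with
+// r == r1 * (2 * gamma2) + r0 (mod q) and -gamma2 < r0 <= gamma2, with a
+// special case when r - r0 == q - 1. r0 is written to lowPart and r1 to
+// highPart; the inline comments in the loop below show the corresponding
+// Java expressions step by step.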
+static address generate_dilithiumDecomposePoly_avx512(StubGenerator *stubgen,
+ MacroAssembler *_masm) {
+
+ __ align(CodeEntryAlignment);
+ StubGenStubId stub_id = dilithiumDecomposePoly_id;
+ StubCodeMark mark(stubgen, stub_id);
+ address start = __ pc();
+ __ enter();
+
+ Label L_loop;
+
+ const Register input = c_rarg0;
+ const Register lowPart = c_rarg1;
+ const Register highPart = c_rarg2;
+ const Register rTwoGamma2 = c_rarg3;
+
+ const Register len = r11;
+ const XMMRegister zero = xmm24;
+ const XMMRegister one = xmm25;
+ const XMMRegister qMinus1 = xmm26;
+ const XMMRegister gamma2 = xmm27;
+ const XMMRegister twoGamma2 = xmm28;
+ const XMMRegister barrettMultiplier = xmm29;
+ const XMMRegister barrettAddend = xmm30;
+
+ __ vpxor(zero, zero, zero, Assembler::AVX_512bit); // 0
+ __ vpternlogd(xmm0, 0xff, xmm0, xmm0, Assembler::AVX_512bit); // -1
+ __ vpsubd(one, zero, xmm0, Assembler::AVX_512bit); // 1
+ __ vpbroadcastd(dilithium_q,
+ ExternalAddress(dilithiumAvx512ConstsAddr(dilithium_qIdx)),
+ Assembler::AVX_512bit, scratch); // q
+ __ vpbroadcastd(barrettAddend,
+ ExternalAddress(dilithiumAvx512ConstsAddr(barrettAddendIdx)),
+ Assembler::AVX_512bit, scratch); // addend for Barrett reduction
+
+ __ evpbroadcastd(twoGamma2, rTwoGamma2, Assembler::AVX_512bit); // 2 * gamma2
+
+ #ifndef _WIN64
+ const Register rMultiplier = c_rarg4;
+ #else
+ const Address multiplier_mem(rbp, 6 * wordSize);
+ const Register rMultiplier = c_rarg3; // arg3 is already consumed, reused here
+ __ movptr(rMultiplier, multiplier_mem);
+ #endif
+ __ evpbroadcastd(barrettMultiplier, rMultiplier,
+ Assembler::AVX_512bit); // multiplier for mod 2 * gamma2 reduce
+
+ __ evpsubd(qMinus1, k0, dilithium_q, one, false, Assembler::AVX_512bit); // q - 1
+ __ evpsrad(gamma2, k0, twoGamma2, 1, false, Assembler::AVX_512bit); // gamma2
+
+ __ movl(len, 1024);
+
+ __ align(OptoLoopAlignment);
+ __ BIND(L_loop);
+
+ load4Xmms(xmm0_3, input, 0, _masm);
+
+ __ addptr(input, 4 * XMMBYTES);
+
+ // rplus in xmm0
+ // rplus = rplus - ((rplus + 5373807) >> 23) * dilithium_q;
+ __ evpaddd(xmm4, k0, xmm0, barrettAddend, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm5, k0, xmm1, barrettAddend, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm6, k0, xmm2, barrettAddend, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm7, k0, xmm3, barrettAddend, false, Assembler::AVX_512bit);
+
+ __ evpsrad(xmm4, k0, xmm4, 23, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm5, k0, xmm5, 23, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm6, k0, xmm6, 23, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm7, k0, xmm7, 23, false, Assembler::AVX_512bit);
+
+ __ evpmulld(xmm4, k0, xmm4, dilithium_q, false, Assembler::AVX_512bit);
+ __ evpmulld(xmm5, k0, xmm5, dilithium_q, false, Assembler::AVX_512bit);
+ __ evpmulld(xmm6, k0, xmm6, dilithium_q, false, Assembler::AVX_512bit);
+ __ evpmulld(xmm7, k0, xmm7, dilithium_q, false, Assembler::AVX_512bit);
+
+ __ evpsubd(xmm0, k0, xmm0, xmm4, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm1, k0, xmm1, xmm5, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm2, k0, xmm2, xmm6, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm3, k0, xmm3, xmm7, false, Assembler::AVX_512bit);
+ // rplus in xmm0
+ // rplus = rplus + ((rplus >> 31) & dilithium_q);
+ __ evpsrad(xmm4, k0, xmm0, 31, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm5, k0, xmm1, 31, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm6, k0, xmm2, 31, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm7, k0, xmm3, 31, false, Assembler::AVX_512bit);
+
+ __ evpandd(xmm4, k0, xmm4, dilithium_q, false, Assembler::AVX_512bit);
+ __ evpandd(xmm5, k0, xmm5, dilithium_q, false, Assembler::AVX_512bit);
+ __ evpandd(xmm6, k0, xmm6, dilithium_q, false, Assembler::AVX_512bit);
+ __ evpandd(xmm7, k0, xmm7, dilithium_q, false, Assembler::AVX_512bit);
+
+ __ evpaddd(xmm0, k0, xmm0, xmm4, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm1, k0, xmm1, xmm5, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm2, k0, xmm2, xmm6, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm3, k0, xmm3, xmm7, false, Assembler::AVX_512bit);
+ // rplus in xmm0
+ // int quotient = (rplus * barrettMultiplier) >> 22;
+ __ evpmulld(xmm4, k0, xmm0, barrettMultiplier, false, Assembler::AVX_512bit);
+ __ evpmulld(xmm5, k0, xmm1, barrettMultiplier, false, Assembler::AVX_512bit);
+ __ evpmulld(xmm6, k0, xmm2, barrettMultiplier, false, Assembler::AVX_512bit);
+ __ evpmulld(xmm7, k0, xmm3, barrettMultiplier, false, Assembler::AVX_512bit);
+
+ __ evpsrad(xmm4, k0, xmm4, 22, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm5, k0, xmm5, 22, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm6, k0, xmm6, 22, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm7, k0, xmm7, 22, false, Assembler::AVX_512bit);
+ // quotient in xmm4
+ // int r0 = rplus - quotient * twoGamma2;
+ __ evpmulld(xmm8, k0, xmm4, twoGamma2, false, Assembler::AVX_512bit);
+ __ evpmulld(xmm9, k0, xmm5, twoGamma2, false, Assembler::AVX_512bit);
+ __ evpmulld(xmm10, k0, xmm6, twoGamma2, false, Assembler::AVX_512bit);
+ __ evpmulld(xmm11, k0, xmm7, twoGamma2, false, Assembler::AVX_512bit);
+
+ __ evpsubd(xmm8, k0, xmm0, xmm8, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm9, k0, xmm1, xmm9, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm10, k0, xmm2, xmm10, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm11, k0, xmm3, xmm11, false, Assembler::AVX_512bit);
+ // r0 in xmm8
+ // int mask = (twoGamma2 - r0) >> 22;
+ __ evpsubd(xmm12, k0, twoGamma2, xmm8, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm13, k0, twoGamma2, xmm9, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm14, k0, twoGamma2, xmm10, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm15, k0, twoGamma2, xmm11, false, Assembler::AVX_512bit);
+
+ __ evpsrad(xmm12, k0, xmm12, 22, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm13, k0, xmm13, 22, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm14, k0, xmm14, 22, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm15, k0, xmm15, 22, false, Assembler::AVX_512bit);
+ // mask in xmm12
+ // r0 -= (mask & twoGamma2);
+ __ evpandd(xmm16, k0, xmm12, twoGamma2, false, Assembler::AVX_512bit);
+ __ evpandd(xmm17, k0, xmm13, twoGamma2, false, Assembler::AVX_512bit);
+ __ evpandd(xmm18, k0, xmm14, twoGamma2, false, Assembler::AVX_512bit);
+ __ evpandd(xmm19, k0, xmm15, twoGamma2, false, Assembler::AVX_512bit);
+
+ __ evpsubd(xmm8, k0, xmm8, xmm16, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm9, k0, xmm9, xmm17, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm10, k0, xmm10, xmm18, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm11, k0, xmm11, xmm19, false, Assembler::AVX_512bit);
+ // r0 in xmm8
+ // quotient += (mask & 1);
+ __ evpandd(xmm16, k0, xmm12, one, false, Assembler::AVX_512bit);
+ __ evpandd(xmm17, k0, xmm13, one, false, Assembler::AVX_512bit);
+ __ evpandd(xmm18, k0, xmm14, one, false, Assembler::AVX_512bit);
+ __ evpandd(xmm19, k0, xmm15, one, false, Assembler::AVX_512bit);
+
+ __ evpaddd(xmm4, k0, xmm4, xmm16, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm5, k0, xmm5, xmm17, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm6, k0, xmm6, xmm18, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm7, k0, xmm7, xmm19, false, Assembler::AVX_512bit);
+
+ // mask = (twoGamma2 / 2 - r0) >> 31;
+ __ evpsubd(xmm12, k0, gamma2, xmm8, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm13, k0, gamma2, xmm9, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm14, k0, gamma2, xmm10, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm15, k0, gamma2, xmm11, false, Assembler::AVX_512bit);
+
+ __ evpsrad(xmm12, k0, xmm12, 31, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm13, k0, xmm13, 31, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm14, k0, xmm14, 31, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm15, k0, xmm15, 31, false, Assembler::AVX_512bit);
+
+ // r0 -= (mask & twoGamma2);
+ __ evpandd(xmm16, k0, xmm12, twoGamma2, false, Assembler::AVX_512bit);
+ __ evpandd(xmm17, k0, xmm13, twoGamma2, false, Assembler::AVX_512bit);
+ __ evpandd(xmm18, k0, xmm14, twoGamma2, false, Assembler::AVX_512bit);
+ __ evpandd(xmm19, k0, xmm15, twoGamma2, false, Assembler::AVX_512bit);
+
+ __ evpsubd(xmm8, k0, xmm8, xmm16, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm9, k0, xmm9, xmm17, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm10, k0, xmm10, xmm18, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm11, k0, xmm11, xmm19, false, Assembler::AVX_512bit);
+ // r0 in xmm8
+ // quotient += (mask & 1);
+ __ evpandd(xmm16, k0, xmm12, one, false, Assembler::AVX_512bit);
+ __ evpandd(xmm17, k0, xmm13, one, false, Assembler::AVX_512bit);
+ __ evpandd(xmm18, k0, xmm14, one, false, Assembler::AVX_512bit);
+ __ evpandd(xmm19, k0, xmm15, one, false, Assembler::AVX_512bit);
+
+ __ evpaddd(xmm4, k0, xmm4, xmm16, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm5, k0, xmm5, xmm17, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm6, k0, xmm6, xmm18, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm7, k0, xmm7, xmm19, false, Assembler::AVX_512bit);
+ // quotient in xmm4
+ // int r1 = rplus - r0 - (dilithium_q - 1);
+ __ evpsubd(xmm16, k0, xmm0, xmm8, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm17, k0, xmm1, xmm9, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm18, k0, xmm2, xmm10, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm19, k0, xmm3, xmm11, false, Assembler::AVX_512bit);
+
+ __ evpsubd(xmm16, k0, xmm16, xmm26, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm17, k0, xmm17, xmm26, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm18, k0, xmm18, xmm26, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm19, k0, xmm19, xmm26, false, Assembler::AVX_512bit);
+ // r1 in xmm16
+ // r1 = (r1 | (-r1)) >> 31; // 0 if rplus - r0 == (dilithium_q - 1), -1 otherwise
+ __ evpsubd(xmm20, k0, zero, xmm16, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm21, k0, zero, xmm17, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm22, k0, zero, xmm18, false, Assembler::AVX_512bit);
+ __ evpsubd(xmm23, k0, zero, xmm19, false, Assembler::AVX_512bit);
+
+ __ evporq(xmm16, k0, xmm16, xmm20, false, Assembler::AVX_512bit);
+ __ evporq(xmm17, k0, xmm17, xmm21, false, Assembler::AVX_512bit);
+ __ evporq(xmm18, k0, xmm18, xmm22, false, Assembler::AVX_512bit);
+ __ evporq(xmm19, k0, xmm19, xmm23, false, Assembler::AVX_512bit);
+
+ __ evpsubd(xmm12, k0, zero, one, false, Assembler::AVX_512bit); // -1
+
+ __ evpsrad(xmm0, k0, xmm16, 31, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm1, k0, xmm17, 31, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm2, k0, xmm18, 31, false, Assembler::AVX_512bit);
+ __ evpsrad(xmm3, k0, xmm19, 31, false, Assembler::AVX_512bit);
+ // r1 in xmm0
+ // r0 += ~r1;
+ __ evpxorq(xmm20, k0, xmm0, xmm12, false, Assembler::AVX_512bit);
+ __ evpxorq(xmm21, k0, xmm1, xmm12, false, Assembler::AVX_512bit);
+ __ evpxorq(xmm22, k0, xmm2, xmm12, false, Assembler::AVX_512bit);
+ __ evpxorq(xmm23, k0, xmm3, xmm12, false, Assembler::AVX_512bit);
+
+ __ evpaddd(xmm8, k0, xmm8, xmm20, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm9, k0, xmm9, xmm21, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm10, k0, xmm10, xmm22, false, Assembler::AVX_512bit);
+ __ evpaddd(xmm11, k0, xmm11, xmm23, false, Assembler::AVX_512bit);
+ // r0 in xmm8
+ // r1 = r1 & quotient;
+ __ evpandd(xmm0, k0, xmm4, xmm0, false, Assembler::AVX_512bit);
+ __ evpandd(xmm1, k0, xmm5, xmm1, false, Assembler::AVX_512bit);
+ __ evpandd(xmm2, k0, xmm6, xmm2, false, Assembler::AVX_512bit);
+ __ evpandd(xmm3, k0, xmm7, xmm3, false, Assembler::AVX_512bit);
+ // r1 in xmm0
+ // lowPart[m] = r0;
+ // highPart[m] = r1;
+ store4Xmms(highPart, 0, xmm0_3, _masm);
+ store4Xmms(lowPart, 0, xmm8_11, _masm);
+
+ __ addptr(highPart, 4 * XMMBYTES);
+ __ addptr(lowPart, 4 * XMMBYTES);
+ __ subl(len, 4 * XMMBYTES);
+ __ jcc(Assembler::notEqual, L_loop);
+
+ __ leave(); // required for proper stackwalking of RuntimeStub frame
+ __ mov64(rax, 0); // return 0
+ __ ret(0);
+
+ return start;
+}
+
+void StubGenerator::generate_dilithium_stubs() {
+ // Generate Dilithium intrinsics code
+ if (UseDilithiumIntrinsics) {
+ StubRoutines::_dilithiumAlmostNtt =
+ generate_dilithiumAlmostNtt_avx512(this, _masm);
+ StubRoutines::_dilithiumAlmostInverseNtt =
+ generate_dilithiumAlmostInverseNtt_avx512(this, _masm);
+ StubRoutines::_dilithiumNttMult =
+ generate_dilithiumNttMult_avx512(this, _masm);
+ StubRoutines::_dilithiumMontMulByConstant =
+ generate_dilithiumMontMulByConstant_avx512(this, _masm);
+ StubRoutines::_dilithiumDecomposePoly =
+ generate_dilithiumDecomposePoly_avx512(this, _masm);
+ }
+}
diff --git a/src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp b/src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp
index a897fe2e6942d..d142414be5ec2 100644
--- a/src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp
+++ b/src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp
@@ -564,7 +564,16 @@ address StubGenerator::generate_intpoly_montgomeryMult_P256() {
address start = __ pc();
__ enter();
- if (EnableX86ECoreOpts && UseAVX > 1) {
+ if (VM_Version::supports_avx512ifma() && VM_Version::supports_avx512vlbw()) {
+ // Register Map
+ const Register aLimbs = c_rarg0; // rdi | rcx
+ const Register bLimbs = c_rarg1; // rsi | rdx
+ const Register rLimbs = c_rarg2; // rdx | r8
+ const Register tmp = r9;
+
+ montgomeryMultiply(aLimbs, bLimbs, rLimbs, tmp, _masm);
+ } else {
+ assert(VM_Version::supports_avxifma(), "Require AVX_IFMA support");
__ push(r12);
__ push(r13);
__ push(r14);
@@ -607,14 +616,6 @@ address StubGenerator::generate_intpoly_montgomeryMult_P256() {
__ pop(r14);
__ pop(r13);
__ pop(r12);
- } else {
- // Register Map
- const Register aLimbs = c_rarg0; // rdi | rcx
- const Register bLimbs = c_rarg1; // rsi | rdx
- const Register rLimbs = c_rarg2; // rdx | r8
- const Register tmp = r9;
-
- montgomeryMultiply(aLimbs, bLimbs, rLimbs, tmp, _masm);
}
__ leave();
diff --git a/src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp b/src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp
index 7d1051711f20c..9f13233f1d217 100644
--- a/src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp
+++ b/src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp
@@ -38,6 +38,8 @@
#define BIND(label) bind(label); BLOCK_COMMENT(#label ":")
+#define xmm(i) as_XMMRegister(i)
+
// Constants
ATTRIBUTE_ALIGNED(64) static const uint64_t round_consts_arr[24] = {
0x0000000000000001L, 0x0000000000008082L, 0x800000000000808AL,
@@ -79,13 +81,6 @@ static address permsAndRotsAddr() {
return (address) permsAndRots;
}
-void StubGenerator::generate_sha3_stubs() {
- if (UseSHA3Intrinsics) {
- StubRoutines::_sha3_implCompress = generate_sha3_implCompress(StubGenStubId::sha3_implCompress_id);
- StubRoutines::_sha3_implCompressMB = generate_sha3_implCompress(StubGenStubId::sha3_implCompressMB_id);
- }
-}
-
// Arguments:
//
// Inputs:
@@ -95,7 +90,9 @@ void StubGenerator::generate_sha3_stubs() {
// c_rarg3 - int offset
// c_rarg4 - int limit
//
-address StubGenerator::generate_sha3_implCompress(StubGenStubId stub_id) {
+static address generate_sha3_implCompress(StubGenStubId stub_id,
+ StubGenerator *stubgen,
+ MacroAssembler *_masm) {
bool multiBlock;
switch(stub_id) {
case sha3_implCompress_id:
@@ -109,7 +106,7 @@ address StubGenerator::generate_sha3_implCompress(StubGenStubId stub_id) {
}
__ align(CodeEntryAlignment);
- StubCodeMark mark(this, stub_id);
+ StubCodeMark mark(stubgen, stub_id);
address start = __ pc();
const Register buf = c_rarg0;
@@ -154,29 +151,16 @@ address StubGenerator::generate_sha3_implCompress(StubGenStubId stub_id) {
__ kshiftrwl(k1, k5, 4);
// load the state
- __ evmovdquq(xmm0, k5, Address(state, 0), false, Assembler::AVX_512bit);
- __ evmovdquq(xmm1, k5, Address(state, 40), false, Assembler::AVX_512bit);
- __ evmovdquq(xmm2, k5, Address(state, 80), false, Assembler::AVX_512bit);
- __ evmovdquq(xmm3, k5, Address(state, 120), false, Assembler::AVX_512bit);
- __ evmovdquq(xmm4, k5, Address(state, 160), false, Assembler::AVX_512bit);
+ for (int i = 0; i < 5; i++) {
+ __ evmovdquq(xmm(i), k5, Address(state, i * 40), false, Assembler::AVX_512bit);
+ }
// load the permutation and rotation constants
- __ evmovdquq(xmm17, Address(permsAndRots, 0), Assembler::AVX_512bit);
- __ evmovdquq(xmm18, Address(permsAndRots, 64), Assembler::AVX_512bit);
- __ evmovdquq(xmm19, Address(permsAndRots, 128), Assembler::AVX_512bit);
- __ evmovdquq(xmm20, Address(permsAndRots, 192), Assembler::AVX_512bit);
- __ evmovdquq(xmm21, Address(permsAndRots, 256), Assembler::AVX_512bit);
- __ evmovdquq(xmm22, Address(permsAndRots, 320), Assembler::AVX_512bit);
- __ evmovdquq(xmm23, Address(permsAndRots, 384), Assembler::AVX_512bit);
- __ evmovdquq(xmm24, Address(permsAndRots, 448), Assembler::AVX_512bit);
- __ evmovdquq(xmm25, Address(permsAndRots, 512), Assembler::AVX_512bit);
- __ evmovdquq(xmm26, Address(permsAndRots, 576), Assembler::AVX_512bit);
- __ evmovdquq(xmm27, Address(permsAndRots, 640), Assembler::AVX_512bit);
- __ evmovdquq(xmm28, Address(permsAndRots, 704), Assembler::AVX_512bit);
- __ evmovdquq(xmm29, Address(permsAndRots, 768), Assembler::AVX_512bit);
- __ evmovdquq(xmm30, Address(permsAndRots, 832), Assembler::AVX_512bit);
- __ evmovdquq(xmm31, Address(permsAndRots, 896), Assembler::AVX_512bit);
+ for (int i = 0; i < 15; i++) {
+ __ evmovdquq(xmm(i + 17), Address(permsAndRots, i * 64), Assembler::AVX_512bit);
+ }
+ __ align(OptoLoopAlignment);
__ BIND(sha3_loop);
// there will be 24 keccak rounds
@@ -231,6 +215,7 @@ address StubGenerator::generate_sha3_implCompress(StubGenStubId stub_id) {
// The implementation closely follows the Java version, with the state
// array "rows" in the lowest 5 64-bit slots of zmm0 - zmm4, i.e.
// each row of the SHA3 specification is located in one zmm register.
+ __ align(OptoLoopAlignment);
__ BIND(rounds24_loop);
__ subl(roundsLeft, 1);
@@ -257,7 +242,7 @@ address StubGenerator::generate_sha3_implCompress(StubGenStubId stub_id) {
// Do the cyclical permutation of the 24 moving state elements
// and the required rotations within each element (the combined
- // rho and sigma steps).
+ // rho and pi steps).
__ evpermt2q(xmm4, xmm17, xmm3, Assembler::AVX_512bit);
__ evpermt2q(xmm3, xmm18, xmm2, Assembler::AVX_512bit);
__ evpermt2q(xmm2, xmm17, xmm1, Assembler::AVX_512bit);
@@ -279,7 +264,7 @@ address StubGenerator::generate_sha3_implCompress(StubGenStubId stub_id) {
__ evpermt2q(xmm2, xmm24, xmm4, Assembler::AVX_512bit);
__ evpermt2q(xmm3, xmm25, xmm4, Assembler::AVX_512bit);
__ evpermt2q(xmm4, xmm26, xmm5, Assembler::AVX_512bit);
- // The combined rho and sigma steps are done.
+ // The combined rho and pi steps are done.
// Do the chi step (the same operation on all 5 rows).
// vpternlogq(x, 180, y, z) does x = x ^ (y & ~z).
@@ -320,11 +305,9 @@ address StubGenerator::generate_sha3_implCompress(StubGenStubId stub_id) {
}
// store the state
- __ evmovdquq(Address(state, 0), k5, xmm0, true, Assembler::AVX_512bit);
- __ evmovdquq(Address(state, 40), k5, xmm1, true, Assembler::AVX_512bit);
- __ evmovdquq(Address(state, 80), k5, xmm2, true, Assembler::AVX_512bit);
- __ evmovdquq(Address(state, 120), k5, xmm3, true, Assembler::AVX_512bit);
- __ evmovdquq(Address(state, 160), k5, xmm4, true, Assembler::AVX_512bit);
+ for (int i = 0; i < 5; i++) {
+ __ evmovdquq(Address(state, i * 40), k5, xmm(i), true, Assembler::AVX_512bit);
+ }
__ pop(r14);
__ pop(r13);
@@ -335,3 +318,193 @@ address StubGenerator::generate_sha3_implCompress(StubGenStubId stub_id) {
return start;
}
+
+// Inputs:
+// c_rarg0 - long[] state0
+// c_rarg1 - long[] state1
+//
+// Performs two keccak() computations in parallel, with the steps of the
+// two computations interleaved.
+static address generate_double_keccak(StubGenerator *stubgen, MacroAssembler *_masm) {
+ __ align(CodeEntryAlignment);
+ StubGenStubId stub_id = double_keccak_id;
+ StubCodeMark mark(stubgen, stub_id);
+ address start = __ pc();
+
+ const Register state0 = c_rarg0;
+ const Register state1 = c_rarg1;
+
+ const Register permsAndRots = c_rarg2;
+ const Register round_consts = c_rarg3;
+ const Register constant2use = r10;
+ const Register roundsLeft = r11;
+
+ Label rounds24_loop;
+
+ __ enter();
+
+ __ lea(permsAndRots, ExternalAddress(permsAndRotsAddr()));
+ __ lea(round_consts, ExternalAddress(round_constsAddr()));
+
+ // set up the masks
+ __ movl(rax, 0x1F);
+ __ kmovwl(k5, rax);
+ __ kshiftrwl(k4, k5, 1);
+ __ kshiftrwl(k3, k5, 2);
+ __ kshiftrwl(k2, k5, 3);
+ __ kshiftrwl(k1, k5, 4);
+
+ // load the states
+ for (int i = 0; i < 5; i++) {
+ __ evmovdquq(xmm(i), k5, Address(state0, i * 40), false, Assembler::AVX_512bit);
+ }
+ for (int i = 0; i < 5; i++) {
+ __ evmovdquq(xmm(10 + i), k5, Address(state1, i * 40), false, Assembler::AVX_512bit);
+ }
+
+ // load the permutation and rotation constants
+
+ for (int i = 0; i < 15; i++) {
+ __ evmovdquq(xmm(17 + i), Address(permsAndRots, i * 64), Assembler::AVX_512bit);
+ }
+
+ // there will be 24 keccak rounds
+ // The same operations as the ones in generate_sha3_implCompress are
+ // performed, but in parallel for two states: one in regs z0-z5, using z6
+ // as the scratch register and the other in z10-z15, using z16 as the
+ // scratch register.
+ // The permutation and rotation constants, that are loaded into z17-z31,
+ // are shared between the two computations.
+ __ movl(roundsLeft, 24);
+ // load round_constants base
+ __ movptr(constant2use, round_consts);
+
+ __ align(OptoLoopAlignment);
+ __ BIND(rounds24_loop);
+ __ subl(roundsLeft, 1);
+
+ __ evmovdquw(xmm5, xmm0, Assembler::AVX_512bit);
+ __ evmovdquw(xmm15, xmm10, Assembler::AVX_512bit);
+ __ vpternlogq(xmm5, 150, xmm1, xmm2, Assembler::AVX_512bit);
+ __ vpternlogq(xmm15, 150, xmm11, xmm12, Assembler::AVX_512bit);
+ __ vpternlogq(xmm5, 150, xmm3, xmm4, Assembler::AVX_512bit);
+ __ vpternlogq(xmm15, 150, xmm13, xmm14, Assembler::AVX_512bit);
+ __ evprolq(xmm6, xmm5, 1, Assembler::AVX_512bit);
+ __ evprolq(xmm16, xmm15, 1, Assembler::AVX_512bit);
+ __ evpermt2q(xmm5, xmm30, xmm5, Assembler::AVX_512bit);
+ __ evpermt2q(xmm15, xmm30, xmm15, Assembler::AVX_512bit);
+ __ evpermt2q(xmm6, xmm31, xmm6, Assembler::AVX_512bit);
+ __ evpermt2q(xmm16, xmm31, xmm16, Assembler::AVX_512bit);
+ __ vpternlogq(xmm0, 150, xmm5, xmm6, Assembler::AVX_512bit);
+ __ vpternlogq(xmm10, 150, xmm15, xmm16, Assembler::AVX_512bit);
+ __ vpternlogq(xmm1, 150, xmm5, xmm6, Assembler::AVX_512bit);
+ __ vpternlogq(xmm11, 150, xmm15, xmm16, Assembler::AVX_512bit);
+ __ vpternlogq(xmm2, 150, xmm5, xmm6, Assembler::AVX_512bit);
+ __ vpternlogq(xmm12, 150, xmm15, xmm16, Assembler::AVX_512bit);
+ __ vpternlogq(xmm3, 150, xmm5, xmm6, Assembler::AVX_512bit);
+ __ vpternlogq(xmm13, 150, xmm15, xmm16, Assembler::AVX_512bit);
+ __ vpternlogq(xmm4, 150, xmm5, xmm6, Assembler::AVX_512bit);
+ __ vpternlogq(xmm14, 150, xmm15, xmm16, Assembler::AVX_512bit);
+ __ evpermt2q(xmm4, xmm17, xmm3, Assembler::AVX_512bit);
+ __ evpermt2q(xmm14, xmm17, xmm13, Assembler::AVX_512bit);
+ __ evpermt2q(xmm3, xmm18, xmm2, Assembler::AVX_512bit);
+ __ evpermt2q(xmm13, xmm18, xmm12, Assembler::AVX_512bit);
+ __ evpermt2q(xmm2, xmm17, xmm1, Assembler::AVX_512bit);
+ __ evpermt2q(xmm12, xmm17, xmm11, Assembler::AVX_512bit);
+ __ evpermt2q(xmm1, xmm19, xmm0, Assembler::AVX_512bit);
+ __ evpermt2q(xmm11, xmm19, xmm10, Assembler::AVX_512bit);
+ __ evpermt2q(xmm4, xmm20, xmm2, Assembler::AVX_512bit);
+ __ evpermt2q(xmm14, xmm20, xmm12, Assembler::AVX_512bit);
+ __ evprolvq(xmm1, xmm1, xmm27, Assembler::AVX_512bit);
+ __ evprolvq(xmm11, xmm11, xmm27, Assembler::AVX_512bit);
+ __ evprolvq(xmm3, xmm3, xmm28, Assembler::AVX_512bit);
+ __ evprolvq(xmm13, xmm13, xmm28, Assembler::AVX_512bit);
+ __ evprolvq(xmm4, xmm4, xmm29, Assembler::AVX_512bit);
+ __ evprolvq(xmm14, xmm14, xmm29, Assembler::AVX_512bit);
+ __ evmovdquw(xmm2, xmm1, Assembler::AVX_512bit);
+ __ evmovdquw(xmm12, xmm11, Assembler::AVX_512bit);
+ __ evmovdquw(xmm5, xmm3, Assembler::AVX_512bit);
+ __ evmovdquw(xmm15, xmm13, Assembler::AVX_512bit);
+ __ evpermt2q(xmm0, xmm21, xmm4, Assembler::AVX_512bit);
+ __ evpermt2q(xmm10, xmm21, xmm14, Assembler::AVX_512bit);
+ __ evpermt2q(xmm1, xmm22, xmm3, Assembler::AVX_512bit);
+ __ evpermt2q(xmm11, xmm22, xmm13, Assembler::AVX_512bit);
+ __ evpermt2q(xmm5, xmm22, xmm2, Assembler::AVX_512bit);
+ __ evpermt2q(xmm15, xmm22, xmm12, Assembler::AVX_512bit);
+ __ evmovdquw(xmm3, xmm1, Assembler::AVX_512bit);
+ __ evmovdquw(xmm13, xmm11, Assembler::AVX_512bit);
+ __ evmovdquw(xmm2, xmm5, Assembler::AVX_512bit);
+ __ evmovdquw(xmm12, xmm15, Assembler::AVX_512bit);
+ __ evpermt2q(xmm1, xmm23, xmm4, Assembler::AVX_512bit);
+ __ evpermt2q(xmm11, xmm23, xmm14, Assembler::AVX_512bit);
+ __ evpermt2q(xmm2, xmm24, xmm4, Assembler::AVX_512bit);
+ __ evpermt2q(xmm12, xmm24, xmm14, Assembler::AVX_512bit);
+ __ evpermt2q(xmm3, xmm25, xmm4, Assembler::AVX_512bit);
+ __ evpermt2q(xmm13, xmm25, xmm14, Assembler::AVX_512bit);
+ __ evpermt2q(xmm4, xmm26, xmm5, Assembler::AVX_512bit);
+ __ evpermt2q(xmm14, xmm26, xmm15, Assembler::AVX_512bit);
+
+ __ evpermt2q(xmm5, xmm31, xmm0, Assembler::AVX_512bit);
+ __ evpermt2q(xmm15, xmm31, xmm10, Assembler::AVX_512bit);
+ __ evpermt2q(xmm6, xmm31, xmm5, Assembler::AVX_512bit);
+ __ evpermt2q(xmm16, xmm31, xmm15, Assembler::AVX_512bit);
+ __ vpternlogq(xmm0, 180, xmm6, xmm5, Assembler::AVX_512bit);
+ __ vpternlogq(xmm10, 180, xmm16, xmm15, Assembler::AVX_512bit);
+
+ __ evpermt2q(xmm5, xmm31, xmm1, Assembler::AVX_512bit);
+ __ evpermt2q(xmm15, xmm31, xmm11, Assembler::AVX_512bit);
+ __ evpermt2q(xmm6, xmm31, xmm5, Assembler::AVX_512bit);
+ __ evpermt2q(xmm16, xmm31, xmm15, Assembler::AVX_512bit);
+ __ vpternlogq(xmm1, 180, xmm6, xmm5, Assembler::AVX_512bit);
+ __ vpternlogq(xmm11, 180, xmm16, xmm15, Assembler::AVX_512bit);
+
+ __ evpxorq(xmm0, k1, xmm0, Address(constant2use, 0), true, Assembler::AVX_512bit);
+ __ evpxorq(xmm10, k1, xmm10, Address(constant2use, 0), true, Assembler::AVX_512bit);
+ __ addptr(constant2use, 8);
+
+ __ evpermt2q(xmm5, xmm31, xmm2, Assembler::AVX_512bit);
+ __ evpermt2q(xmm15, xmm31, xmm12, Assembler::AVX_512bit);
+ __ evpermt2q(xmm6, xmm31, xmm5, Assembler::AVX_512bit);
+ __ evpermt2q(xmm16, xmm31, xmm15, Assembler::AVX_512bit);
+ __ vpternlogq(xmm2, 180, xmm6, xmm5, Assembler::AVX_512bit);
+ __ vpternlogq(xmm12, 180, xmm16, xmm15, Assembler::AVX_512bit);
+
+ __ evpermt2q(xmm5, xmm31, xmm3, Assembler::AVX_512bit);
+ __ evpermt2q(xmm15, xmm31, xmm13, Assembler::AVX_512bit);
+ __ evpermt2q(xmm6, xmm31, xmm5, Assembler::AVX_512bit);
+ __ evpermt2q(xmm16, xmm31, xmm15, Assembler::AVX_512bit);
+ __ vpternlogq(xmm3, 180, xmm6, xmm5, Assembler::AVX_512bit);
+ __ vpternlogq(xmm13, 180, xmm16, xmm15, Assembler::AVX_512bit);
+ __ evpermt2q(xmm5, xmm31, xmm4, Assembler::AVX_512bit);
+ __ evpermt2q(xmm15, xmm31, xmm14, Assembler::AVX_512bit);
+ __ evpermt2q(xmm6, xmm31, xmm5, Assembler::AVX_512bit);
+ __ evpermt2q(xmm16, xmm31, xmm15, Assembler::AVX_512bit);
+ __ vpternlogq(xmm4, 180, xmm6, xmm5, Assembler::AVX_512bit);
+ __ vpternlogq(xmm14, 180, xmm16, xmm15, Assembler::AVX_512bit);
+ __ cmpl(roundsLeft, 0);
+ __ jcc(Assembler::notEqual, rounds24_loop);
+
+ // store the states
+ for (int i = 0; i < 5; i++) {
+ __ evmovdquq(Address(state0, i * 40), k5, xmm(i), true, Assembler::AVX_512bit);
+ }
+ for (int i = 0; i < 5; i++) {
+ __ evmovdquq(Address(state1, i * 40), k5, xmm(10 + i), true, Assembler::AVX_512bit);
+ }
+
+ __ leave(); // required for proper stackwalking of RuntimeStub frame
+ __ ret(0);
+
+ return start;
+}
+
+void StubGenerator::generate_sha3_stubs() {
+ if (UseSHA3Intrinsics) {
+ StubRoutines::_sha3_implCompress =
+ generate_sha3_implCompress(StubGenStubId::sha3_implCompress_id, this, _masm);
+ StubRoutines::_double_keccak =
+ generate_double_keccak(this, _masm);
+ StubRoutines::_sha3_implCompressMB =
+ generate_sha3_implCompress(StubGenStubId::sha3_implCompressMB_id, this, _masm);
+ }
+}
diff --git a/src/hotspot/cpu/x86/stubGenerator_x86_64_tanh.cpp b/src/hotspot/cpu/x86/stubGenerator_x86_64_tanh.cpp
index d13809bfcd911..52ce2731b1fde 100644
--- a/src/hotspot/cpu/x86/stubGenerator_x86_64_tanh.cpp
+++ b/src/hotspot/cpu/x86/stubGenerator_x86_64_tanh.cpp
@@ -1,5 +1,5 @@
/*
-* Copyright (c) 2024, Intel Corporation. All rights reserved.
+* Copyright (c) 2024, 2025, Intel Corporation. All rights reserved.
* Intel Math Library (LIBM) Source Code
*
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
@@ -46,7 +46,7 @@
// for |x| in [23/64,3*2^7)
// e^{-2*|x|}=2^{-k-f}*2^{-r} ~ 2^{-k}*(Tn+Dn)*(1+p)=(T0+D0)*(1+p)
//
-// For |x| in [2^{-4},2^5):
+// For |x| in [2^{-4},22):
// 2^{-r}-1 ~ p=c1*r+c2*r^2+..+c5*r^5
// Let R=1/(1+T0+p*T0), truncated to 35 significant bits
// R=1/(1+T0+D0+p*(T0+D0))*(1+eps), |eps|<2^{-33}
@@ -66,11 +66,11 @@
//
// For |x|<2^{-64}: x is returned
//
-// For |x|>=2^32: return +/-1
+// For |x|>=22: return +/-1
//
// Special cases:
// tanh(NaN) = quiet NaN, and raise invalid exception
-// tanh(INF) = that INF
+// tanh(+/-INF) = +/-1
// tanh(+/-0) = +/-0
//
/******************************************************************************/
@@ -324,6 +324,12 @@ address StubGenerator::generate_libmTanh() {
__ enter(); // required for proper stackwalking of RuntimeStub frame
__ bind(B1_2);
+ __ pextrw(rcx, xmm0, 3);
+ __ movl(rdx, 32768);
+ __ andl(rdx, rcx);
+ __ andl(rcx, 32767);
+ __ cmpl(rcx, 16438);
+ __ jcc(Assembler::aboveEqual, L_2TAG_PACKET_2_0_1); // Branch only if |x| >= 22
__ movsd(xmm3, ExternalAddress(HALFMASK), r11 /*rscratch*/);
__ xorpd(xmm4, xmm4);
__ movsd(xmm1, ExternalAddress(L2E), r11 /*rscratch*/);
@@ -331,16 +337,12 @@ address StubGenerator::generate_libmTanh() {
__ movl(rax, 32768);
__ pinsrw(xmm4, rax, 3);
__ movsd(xmm6, ExternalAddress(Shifter), r11 /*rscratch*/);
- __ pextrw(rcx, xmm0, 3);
__ andpd(xmm3, xmm0);
__ andnpd(xmm4, xmm0);
__ pshufd(xmm5, xmm4, 68);
- __ movl(rdx, 32768);
- __ andl(rdx, rcx);
- __ andl(rcx, 32767);
__ subl(rcx, 16304);
- __ cmpl(rcx, 144);
- __ jcc(Assembler::aboveEqual, L_2TAG_PACKET_0_0_1);
+ __ cmpl(rcx, 134);
+ __ jcc(Assembler::aboveEqual, L_2TAG_PACKET_0_0_1); // Branch only if |x| is not in [2^{-4},22)
__ subsd(xmm4, xmm3);
__ mulsd(xmm3, xmm1);
__ mulsd(xmm2, xmm5);
@@ -427,8 +429,8 @@ address StubGenerator::generate_libmTanh() {
__ bind(L_2TAG_PACKET_0_0_1);
__ addl(rcx, 960);
- __ cmpl(rcx, 1104);
- __ jcc(Assembler::aboveEqual, L_2TAG_PACKET_1_0_1);
+ __ cmpl(rcx, 1094);
+ __ jcc(Assembler::aboveEqual, L_2TAG_PACKET_1_0_1); // Branch only if |x| not in [2^{-64}, 2^{-4})
__ movdqu(xmm2, ExternalAddress(pv), r11 /*rscratch*/);
__ pshufd(xmm1, xmm0, 68);
__ movdqu(xmm3, ExternalAddress(pv + 16), r11 /*rscratch*/);
@@ -449,11 +451,8 @@ address StubGenerator::generate_libmTanh() {
__ jmp(B1_4);
__ bind(L_2TAG_PACKET_1_0_1);
- __ addl(rcx, 15344);
- __ cmpl(rcx, 16448);
- __ jcc(Assembler::aboveEqual, L_2TAG_PACKET_2_0_1);
__ cmpl(rcx, 16);
- __ jcc(Assembler::below, L_2TAG_PACKET_3_0_1);
+ __ jcc(Assembler::below, L_2TAG_PACKET_3_0_1); // Branch only if |x| is denormalized
__ xorpd(xmm2, xmm2);
__ movl(rax, 17392);
__ pinsrw(xmm2, rax, 3);
@@ -468,7 +467,7 @@ address StubGenerator::generate_libmTanh() {
__ bind(L_2TAG_PACKET_2_0_1);
__ cmpl(rcx, 32752);
- __ jcc(Assembler::aboveEqual, L_2TAG_PACKET_4_0_1);
+ __ jcc(Assembler::aboveEqual, L_2TAG_PACKET_4_0_1); // Branch only if |x| is INF or NaN
__ xorpd(xmm2, xmm2);
__ movl(rcx, 15344);
__ pinsrw(xmm2, rcx, 3);
@@ -489,7 +488,7 @@ address StubGenerator::generate_libmTanh() {
__ movdl(rcx, xmm2);
__ orl(rcx, rax);
__ cmpl(rcx, 0);
- __ jcc(Assembler::equal, L_2TAG_PACKET_5_0_1);
+ __ jcc(Assembler::equal, L_2TAG_PACKET_5_0_1); // Branch only if |x| is not NaN
__ addsd(xmm0, xmm0);
__ bind(B1_4);
diff --git a/src/hotspot/cpu/x86/stubRoutines_x86.cpp b/src/hotspot/cpu/x86/stubRoutines_x86.cpp
index 861c1e1216e3b..9b524ae94cf45 100644
--- a/src/hotspot/cpu/x86/stubRoutines_x86.cpp
+++ b/src/hotspot/cpu/x86/stubRoutines_x86.cpp
@@ -46,10 +46,8 @@ STUBGEN_ARCH_ENTRIES_DO(DEFINE_ARCH_ENTRY, DEFINE_ARCH_ENTRY_INIT)
#undef DEFINE_ARCH_ENTRY
address StubRoutines::x86::_k256_adr = nullptr;
-#ifdef _LP64
address StubRoutines::x86::_k256_W_adr = nullptr;
address StubRoutines::x86::_k512_W_addr = nullptr;
-#endif
const uint64_t StubRoutines::x86::_crc_by128_masks[] =
{
@@ -146,7 +144,6 @@ const juint StubRoutines::x86::_crc_table[] =
0x2d02ef8dUL
};
-#ifdef _LP64
const juint StubRoutines::x86::_crc_table_avx512[] =
{
0xe95c1271UL, 0x00000000UL, 0xce3371cbUL, 0x00000000UL,
@@ -193,7 +190,6 @@ const juint StubRoutines::x86::_shuf_table_crc32_avx512[] =
0x83828100UL, 0x87868584UL, 0x8b8a8988UL, 0x8f8e8d8cUL,
0x03020100UL, 0x07060504UL, 0x0b0a0908UL, 0x000e0d0cUL
};
-#endif // _LP64
const jint StubRoutines::x86::_arrays_hashcode_powers_of_31[] =
{
@@ -356,7 +352,6 @@ ATTRIBUTE_ALIGNED(64) const juint StubRoutines::x86::_k256[] =
0x90befffaUL, 0xa4506cebUL, 0xbef9a3f7UL, 0xc67178f2UL
};
-#ifdef _LP64
// used in MacroAssembler::sha256_AVX2
// dynamically built from _k256
ATTRIBUTE_ALIGNED(64) juint StubRoutines::x86::_k256_W[2*sizeof(StubRoutines::x86::_k256)];
@@ -405,4 +400,3 @@ ATTRIBUTE_ALIGNED(64) const julong StubRoutines::x86::_k512_W[] =
0x4cc5d4becb3e42b6ULL, 0x597f299cfc657e2aULL,
0x5fcb6fab3ad6faecULL, 0x6c44198c4a475817ULL,
};
-#endif
diff --git a/src/hotspot/cpu/x86/stubRoutines_x86.hpp b/src/hotspot/cpu/x86/stubRoutines_x86.hpp
index aaf84eb843777..c4930e1593c47 100644
--- a/src/hotspot/cpu/x86/stubRoutines_x86.hpp
+++ b/src/hotspot/cpu/x86/stubRoutines_x86.hpp
@@ -75,39 +75,16 @@ class x86 {
#undef DEFINE_ARCH_ENTRY_GETTER_INIT
#undef DEFINE_ARCH_GETTER_ENTRY
-
-#ifndef _LP64
-
- static jint _fpu_cntrl_wrd_std;
- static jint _fpu_cntrl_wrd_24;
- static jint _fpu_cntrl_wrd_trunc;
-
- static jint _fpu_subnormal_bias1[3];
- static jint _fpu_subnormal_bias2[3];
-
- static address addr_fpu_cntrl_wrd_std() { return (address)&_fpu_cntrl_wrd_std; }
- static address addr_fpu_cntrl_wrd_24() { return (address)&_fpu_cntrl_wrd_24; }
- static address addr_fpu_cntrl_wrd_trunc() { return (address)&_fpu_cntrl_wrd_trunc; }
- static address addr_fpu_subnormal_bias1() { return (address)&_fpu_subnormal_bias1; }
- static address addr_fpu_subnormal_bias2() { return (address)&_fpu_subnormal_bias2; }
-
- static jint fpu_cntrl_wrd_std() { return _fpu_cntrl_wrd_std; }
-#endif // !LP64
-
private:
static jint _mxcsr_std;
-#ifdef _LP64
static jint _mxcsr_rz;
-#endif // _LP64
// masks and table for CRC32
static const uint64_t _crc_by128_masks[];
static const juint _crc_table[];
-#ifdef _LP64
static const juint _crc_by128_masks_avx512[];
static const juint _crc_table_avx512[];
static const juint _crc32c_table_avx512[];
static const juint _shuf_table_crc32_avx512[];
-#endif // _LP64
// table for CRC32C
static juint* _crc32c_table;
// table for arrays_hashcode
@@ -115,30 +92,22 @@ class x86 {
//k256 table for sha256
static const juint _k256[];
static address _k256_adr;
-#ifdef _LP64
static juint _k256_W[];
static address _k256_W_adr;
static const julong _k512_W[];
static address _k512_W_addr;
-#endif
public:
static address addr_mxcsr_std() { return (address)&_mxcsr_std; }
-#ifdef _LP64
static address addr_mxcsr_rz() { return (address)&_mxcsr_rz; }
-#endif // _LP64
static address crc_by128_masks_addr() { return (address)_crc_by128_masks; }
-#ifdef _LP64
static address crc_by128_masks_avx512_addr() { return (address)_crc_by128_masks_avx512; }
static address shuf_table_crc32_avx512_addr() { return (address)_shuf_table_crc32_avx512; }
static address crc_table_avx512_addr() { return (address)_crc_table_avx512; }
static address crc32c_table_avx512_addr() { return (address)_crc32c_table_avx512; }
-#endif // _LP64
static address k256_addr() { return _k256_adr; }
-#ifdef _LP64
static address k256_W_addr() { return _k256_W_adr; }
static address k512_W_addr() { return _k512_W_addr; }
-#endif
static address arrays_hashcode_powers_of_31() { return (address)_arrays_hashcode_powers_of_31; }
static void generate_CRC32C_table(bool is_pclmulqdq_supported);
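
For illustration only (not part of the patch): `_k256_W` is declared above but, per the comment in the .cpp hunk, built at runtime from `_k256` for the AVX2 SHA-256 kernel, which wants each 32-bit round constant duplicated within a 128-bit lane. A minimal standalone sketch of that kind of expansion, with assumed names, sizes and duplication factor:

```cpp
#include <cstdint>

// Hypothetical sketch: duplicate each SHA-256 round constant so the AVX2 kernel can
// broadcast one constant across both 128-bit lanes of a YMM register. Sizes assumed.
static const uint32_t k256[64] = { 0x428a2f98u, 0x71374491u /* ... remaining constants ... */ };
static uint32_t k256_W[2 * 64];

static void build_k256_W() {
  for (int i = 0; i < 64; i++) {
    k256_W[2 * i]     = k256[i];
    k256_W[2 * i + 1] = k256[i];
  }
}
```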
diff --git a/src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp b/src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp
index efbdac8244dfc..45e30a8b4fb52 100644
--- a/src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp
+++ b/src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp
@@ -204,10 +204,10 @@ address TemplateInterpreterGenerator::generate_return_entry_for(TosState state,
}
if (JvmtiExport::can_pop_frame()) {
- __ check_and_handle_popframe(r15_thread);
+ __ check_and_handle_popframe();
}
if (JvmtiExport::can_force_early_return()) {
- __ check_and_handle_earlyret(r15_thread);
+ __ check_and_handle_earlyret();
}
__ dispatch_next(state, step);
@@ -654,7 +654,7 @@ address TemplateInterpreterGenerator::generate_Reference_get_entry(void) {
// Load the value of the referent field.
const Address field_address(rax, referent_offset);
- __ load_heap_oop(rax, field_address, /*tmp1*/ rbx, /*tmp_thread*/ rdx, ON_WEAK_OOP_REF);
+ __ load_heap_oop(rax, field_address, /*tmp1*/ rbx, ON_WEAK_OOP_REF);
// _areturn
__ pop(rdi); // get return address
@@ -991,7 +991,7 @@ address TemplateInterpreterGenerator::generate_native_entry(bool synchronized) {
Label Continue;
Label slow_path;
- __ safepoint_poll(slow_path, thread, true /* at_return */, false /* in_nmethod */);
+ __ safepoint_poll(slow_path, true /* at_return */, false /* in_nmethod */);
__ cmpl(Address(thread, JavaThread::suspend_flags_offset()), 0);
__ jcc(Assembler::equal, Continue);
@@ -1034,7 +1034,7 @@ address TemplateInterpreterGenerator::generate_native_entry(bool synchronized) {
}
// reset_last_Java_frame
- __ reset_last_Java_frame(thread, true);
+ __ reset_last_Java_frame(true);
if (CheckJNICalls) {
// clear_pending_jni_exception_check
@@ -1057,7 +1057,6 @@ address TemplateInterpreterGenerator::generate_native_entry(bool synchronized) {
__ pop(ltos);
// Unbox oop result, e.g. JNIHandles::resolve value.
__ resolve_jobject(rax /* value */,
- thread /* thread */,
t /* tmp */);
__ movptr(Address(rbp, frame::interpreter_frame_oop_temp_offset*wordSize), rax);
// keep stack depth as expected by pushing oop which will eventually be discarded
@@ -1495,7 +1494,7 @@ void TemplateInterpreterGenerator::generate_throw_exception() {
// PC must point into interpreter here
__ set_last_Java_frame(noreg, rbp, __ pc(), rscratch1);
__ super_call_VM_leaf(CAST_FROM_FN_PTR(address, InterpreterRuntime::popframe_move_outgoing_args), r15_thread, c_rarg1, c_rarg2);
- __ reset_last_Java_frame(thread, true);
+ __ reset_last_Java_frame(true);
// Restore the last_sp and null it out
__ movptr(rcx, Address(rbp, frame::interpreter_frame_last_sp_offset * wordSize));
@@ -1544,11 +1543,11 @@ void TemplateInterpreterGenerator::generate_throw_exception() {
// preserve exception over this code sequence
__ pop_ptr(rax);
- __ movptr(Address(thread, JavaThread::vm_result_offset()), rax);
+ __ movptr(Address(thread, JavaThread::vm_result_oop_offset()), rax);
// remove the activation (without doing throws on illegalMonitorExceptions)
__ remove_activation(vtos, rdx, false, true, false);
// restore exception
- __ get_vm_result(rax, thread);
+ __ get_vm_result_oop(rax);
// In between activations - previous activation type unknown yet
// compute continuation point - the continuation point expects the
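
For context (a sketch, not part of the patch): the interpreter changes above, and those in templateInterpreterGenerator_x86_64.cpp next, follow one pattern — with 32-bit x86 gone, the current JavaThread is always pinned in `r15`, so helpers such as `safepoint_poll`, `reset_last_Java_frame`, `resolve_jobject`, `check_and_handle_popframe` and `clinit_barrier` no longer take an explicit thread register. Schematically, using stand-in declarations rather than the real MacroAssembler API:

```cpp
// Stand-in types only; names illustrate the pattern, not HotSpot's actual declarations.
enum Register { rax, rbx, r15_thread };

struct MacroAssemblerSketch {
  // Old shape: every caller passed the thread register explicitly.
  void reset_last_Java_frame(Register thread, bool clear_fp) {
    (void)thread; (void)clear_fp; /* emit the frame-anchor clears here */
  }
  // New shape: x86_64 dedicates r15 to the current JavaThread, so the
  // parameter is dropped and the helper uses r15_thread internally.
  void reset_last_Java_frame(bool clear_fp) {
    reset_last_Java_frame(r15_thread, clear_fp);
  }
};
```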
diff --git a/src/hotspot/cpu/x86/templateInterpreterGenerator_x86_64.cpp b/src/hotspot/cpu/x86/templateInterpreterGenerator_x86_64.cpp
index f4ed1081c3b6a..6be702f2699a6 100644
--- a/src/hotspot/cpu/x86/templateInterpreterGenerator_x86_64.cpp
+++ b/src/hotspot/cpu/x86/templateInterpreterGenerator_x86_64.cpp
@@ -190,7 +190,7 @@ address TemplateInterpreterGenerator::generate_CRC32_update_entry() {
// c_rarg1: scratch (rsi on non-Win64, rdx on Win64)
Label slow_path;
- __ safepoint_poll(slow_path, r15_thread, true /* at_return */, false /* in_nmethod */);
+ __ safepoint_poll(slow_path, true /* at_return */, false /* in_nmethod */);
// We don't generate local frame and don't align stack because
// we call stub code and there is no safepoint on this path.
@@ -234,7 +234,7 @@ address TemplateInterpreterGenerator::generate_CRC32_updateBytes_entry(AbstractI
// r13: senderSP must preserved for slow path, set SP to it on fast path
Label slow_path;
- __ safepoint_poll(slow_path, r15_thread, false /* at_return */, false /* in_nmethod */);
+ __ safepoint_poll(slow_path, false /* at_return */, false /* in_nmethod */);
// We don't generate local frame and don't align stack because
// we call stub code and there is no safepoint on this path.
diff --git a/src/hotspot/cpu/x86/templateTable_x86.cpp b/src/hotspot/cpu/x86/templateTable_x86.cpp
index b378bc431fad5..43da80f408261 100644
--- a/src/hotspot/cpu/x86/templateTable_x86.cpp
+++ b/src/hotspot/cpu/x86/templateTable_x86.cpp
@@ -151,7 +151,7 @@ static void do_oop_load(InterpreterMacroAssembler* _masm,
Address src,
Register dst,
DecoratorSet decorators = 0) {
- __ load_heap_oop(dst, src, rdx, rbx, decorators);
+ __ load_heap_oop(dst, src, rdx, decorators);
}
Address TemplateTable::at_bcp(int offset) {
@@ -276,44 +276,36 @@ void TemplateTable::lconst(int value) {
void TemplateTable::fconst(int value) {
transition(vtos, ftos);
- if (UseSSE >= 1) {
- static float one = 1.0f, two = 2.0f;
- switch (value) {
- case 0:
- __ xorps(xmm0, xmm0);
- break;
- case 1:
- __ movflt(xmm0, ExternalAddress((address) &one), rscratch1);
- break;
- case 2:
- __ movflt(xmm0, ExternalAddress((address) &two), rscratch1);
- break;
- default:
- ShouldNotReachHere();
- break;
- }
- } else {
+ static float one = 1.0f, two = 2.0f;
+ switch (value) {
+ case 0:
+ __ xorps(xmm0, xmm0);
+ break;
+ case 1:
+ __ movflt(xmm0, ExternalAddress((address) &one), rscratch1);
+ break;
+ case 2:
+ __ movflt(xmm0, ExternalAddress((address) &two), rscratch1);
+ break;
+ default:
ShouldNotReachHere();
+ break;
}
}
void TemplateTable::dconst(int value) {
transition(vtos, dtos);
- if (UseSSE >= 2) {
- static double one = 1.0;
- switch (value) {
- case 0:
- __ xorpd(xmm0, xmm0);
- break;
- case 1:
- __ movdbl(xmm0, ExternalAddress((address) &one), rscratch1);
- break;
- default:
- ShouldNotReachHere();
- break;
- }
- } else {
+ static double one = 1.0;
+ switch (value) {
+ case 0:
+ __ xorpd(xmm0, xmm0);
+ break;
+ case 1:
+ __ movdbl(xmm0, ExternalAddress((address) &one), rscratch1);
+ break;
+ default:
ShouldNotReachHere();
+ break;
}
}
@@ -373,7 +365,7 @@ void TemplateTable::ldc(LdcType type) {
__ jccb(Assembler::notEqual, notFloat);
// ftos
- __ load_float(Address(rcx, rbx, Address::times_ptr, base_offset));
+ __ movflt(xmm0, Address(rcx, rbx, Address::times_ptr, base_offset));
__ push(ftos);
__ jmp(Done);
@@ -452,7 +444,7 @@ void TemplateTable::ldc2_w() {
__ jccb(Assembler::notEqual, notDouble);
// dtos
- __ load_double(Address(rcx, rbx, Address::times_ptr, base_offset));
+ __ movdbl(xmm0, Address(rcx, rbx, Address::times_ptr, base_offset));
__ push(dtos);
__ jmp(Done);
@@ -478,7 +470,7 @@ void TemplateTable::condy_helper(Label& Done) {
const Register rarg = c_rarg1;
__ movl(rarg, (int)bytecode());
call_VM(obj, CAST_FROM_FN_PTR(address, InterpreterRuntime::resolve_ldc), rarg);
- __ get_vm_result_2(flags, r15_thread);
+ __ get_vm_result_metadata(flags);
// VMr = obj = base address to find primitive value to push
// VMr2 = flags = (tos, off) using format of CPCE::_flags
__ movl(off, flags);
@@ -506,7 +498,7 @@ void TemplateTable::condy_helper(Label& Done) {
__ cmpl(flags, ftos);
__ jccb(Assembler::notEqual, notFloat);
// ftos
- __ load_float(field);
+ __ movflt(xmm0, field);
__ push(ftos);
__ jmp(Done);
@@ -561,7 +553,7 @@ void TemplateTable::condy_helper(Label& Done) {
__ cmpl(flags, dtos);
__ jccb(Assembler::notEqual, notDouble);
// dtos
- __ load_double(field);
+ __ movdbl(xmm0, field);
__ push(dtos);
__ jmp(Done);
@@ -655,13 +647,13 @@ void TemplateTable::lload() {
void TemplateTable::fload() {
transition(vtos, ftos);
locals_index(rbx);
- __ load_float(faddress(rbx));
+ __ movflt(xmm0, faddress(rbx));
}
void TemplateTable::dload() {
transition(vtos, dtos);
locals_index(rbx);
- __ load_double(daddress(rbx));
+ __ movdbl(xmm0, daddress(rbx));
}
void TemplateTable::aload() {
@@ -692,13 +684,13 @@ void TemplateTable::wide_lload() {
void TemplateTable::wide_fload() {
transition(vtos, ftos);
locals_index_wide(rbx);
- __ load_float(faddress(rbx));
+ __ movflt(xmm0, faddress(rbx));
}
void TemplateTable::wide_dload() {
transition(vtos, dtos);
locals_index_wide(rbx);
- __ load_double(daddress(rbx));
+ __ movdbl(xmm0, daddress(rbx));
}
void TemplateTable::wide_aload() {
@@ -740,7 +732,7 @@ void TemplateTable::iaload() {
__ access_load_at(T_INT, IN_HEAP | IS_ARRAY, rax,
Address(rdx, rax, Address::times_4,
arrayOopDesc::base_offset_in_bytes(T_INT)),
- noreg, noreg);
+ noreg);
}
void TemplateTable::laload() {
@@ -752,7 +744,7 @@ void TemplateTable::laload() {
__ access_load_at(T_LONG, IN_HEAP | IS_ARRAY, noreg /* ltos */,
Address(rdx, rbx, Address::times_8,
arrayOopDesc::base_offset_in_bytes(T_LONG)),
- noreg, noreg);
+ noreg);
}
@@ -766,7 +758,7 @@ void TemplateTable::faload() {
Address(rdx, rax,
Address::times_4,
arrayOopDesc::base_offset_in_bytes(T_FLOAT)),
- noreg, noreg);
+ noreg);
}
void TemplateTable::daload() {
@@ -778,7 +770,7 @@ void TemplateTable::daload() {
Address(rdx, rax,
Address::times_8,
arrayOopDesc::base_offset_in_bytes(T_DOUBLE)),
- noreg, noreg);
+ noreg);
}
void TemplateTable::aaload() {
@@ -801,7 +793,7 @@ void TemplateTable::baload() {
index_check(rdx, rax); // kills rbx
__ access_load_at(T_BYTE, IN_HEAP | IS_ARRAY, rax,
Address(rdx, rax, Address::times_1, arrayOopDesc::base_offset_in_bytes(T_BYTE)),
- noreg, noreg);
+ noreg);
}
void TemplateTable::caload() {
@@ -811,7 +803,7 @@ void TemplateTable::caload() {
index_check(rdx, rax); // kills rbx
__ access_load_at(T_CHAR, IN_HEAP | IS_ARRAY, rax,
Address(rdx, rax, Address::times_2, arrayOopDesc::base_offset_in_bytes(T_CHAR)),
- noreg, noreg);
+ noreg);
}
// iload followed by caload frequent pair
@@ -826,7 +818,7 @@ void TemplateTable::fast_icaload() {
index_check(rdx, rax); // kills rbx
__ access_load_at(T_CHAR, IN_HEAP | IS_ARRAY, rax,
Address(rdx, rax, Address::times_2, arrayOopDesc::base_offset_in_bytes(T_CHAR)),
- noreg, noreg);
+ noreg);
}
@@ -837,7 +829,7 @@ void TemplateTable::saload() {
index_check(rdx, rax); // kills rbx
__ access_load_at(T_SHORT, IN_HEAP | IS_ARRAY, rax,
Address(rdx, rax, Address::times_2, arrayOopDesc::base_offset_in_bytes(T_SHORT)),
- noreg, noreg);
+ noreg);
}
void TemplateTable::iload(int n) {
@@ -852,12 +844,12 @@ void TemplateTable::lload(int n) {
void TemplateTable::fload(int n) {
transition(vtos, ftos);
- __ load_float(faddress(n));
+ __ movflt(xmm0, faddress(n));
}
void TemplateTable::dload(int n) {
transition(vtos, dtos);
- __ load_double(daddress(n));
+ __ movdbl(xmm0, daddress(n));
}
void TemplateTable::aload(int n) {
@@ -959,13 +951,13 @@ void TemplateTable::lstore() {
void TemplateTable::fstore() {
transition(ftos, vtos);
locals_index(rbx);
- __ store_float(faddress(rbx));
+ __ movflt(faddress(rbx), xmm0);
}
void TemplateTable::dstore() {
transition(dtos, vtos);
locals_index(rbx);
- __ store_double(daddress(rbx));
+ __ movdbl(daddress(rbx), xmm0);
}
void TemplateTable::astore() {
@@ -1041,7 +1033,7 @@ void TemplateTable::lastore() {
void TemplateTable::fastore() {
transition(ftos, vtos);
__ pop_i(rbx);
- // value is in UseSSE >= 1 ? xmm0 : ST(0)
+ // value is in xmm0
// rbx: index
// rdx: array
index_check(rdx, rbx); // prefer index in rbx
@@ -1054,7 +1046,7 @@ void TemplateTable::fastore() {
void TemplateTable::dastore() {
transition(dtos, vtos);
__ pop_i(rbx);
- // value is in UseSSE >= 2 ? xmm0 : ST(0)
+ // value is in xmm0
// rbx: index
// rdx: array
index_check(rdx, rbx); // prefer index in rbx
@@ -1170,12 +1162,12 @@ void TemplateTable::lstore(int n) {
void TemplateTable::fstore(int n) {
transition(ftos, vtos);
- __ store_float(faddress(n));
+ __ movflt(faddress(n), xmm0);
}
void TemplateTable::dstore(int n) {
transition(dtos, vtos);
- __ store_double(daddress(n));
+ __ movdbl(daddress(n), xmm0);
}
@@ -1397,81 +1389,73 @@ void TemplateTable::lushr() {
void TemplateTable::fop2(Operation op) {
transition(ftos, ftos);
- if (UseSSE >= 1) {
- switch (op) {
- case add:
- __ addss(xmm0, at_rsp());
- __ addptr(rsp, Interpreter::stackElementSize);
- break;
- case sub:
- __ movflt(xmm1, xmm0);
- __ pop_f(xmm0);
- __ subss(xmm0, xmm1);
- break;
- case mul:
- __ mulss(xmm0, at_rsp());
- __ addptr(rsp, Interpreter::stackElementSize);
- break;
- case div:
- __ movflt(xmm1, xmm0);
- __ pop_f(xmm0);
- __ divss(xmm0, xmm1);
- break;
- case rem:
- // On x86_64 platforms the SharedRuntime::frem method is called to perform the
- // modulo operation. The frem method calls the function
- // double fmod(double x, double y) in math.h. The documentation of fmod states:
- // "If x or y is a NaN, a NaN is returned." without specifying what type of NaN
- // (signalling or quiet) is returned.
- __ movflt(xmm1, xmm0);
- __ pop_f(xmm0);
- __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::frem), 2);
- break;
- default:
- ShouldNotReachHere();
- break;
- }
- } else {
+ switch (op) {
+ case add:
+ __ addss(xmm0, at_rsp());
+ __ addptr(rsp, Interpreter::stackElementSize);
+ break;
+ case sub:
+ __ movflt(xmm1, xmm0);
+ __ pop_f(xmm0);
+ __ subss(xmm0, xmm1);
+ break;
+ case mul:
+ __ mulss(xmm0, at_rsp());
+ __ addptr(rsp, Interpreter::stackElementSize);
+ break;
+ case div:
+ __ movflt(xmm1, xmm0);
+ __ pop_f(xmm0);
+ __ divss(xmm0, xmm1);
+ break;
+ case rem:
+ // On x86_64 platforms the SharedRuntime::frem method is called to perform the
+ // modulo operation. The frem method calls the function
+ // double fmod(double x, double y) in math.h. The documentation of fmod states:
+ // "If x or y is a NaN, a NaN is returned." without specifying what type of NaN
+ // (signalling or quiet) is returned.
+ __ movflt(xmm1, xmm0);
+ __ pop_f(xmm0);
+ __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::frem), 2);
+ break;
+ default:
ShouldNotReachHere();
+ break;
}
}
void TemplateTable::dop2(Operation op) {
transition(dtos, dtos);
- if (UseSSE >= 2) {
- switch (op) {
- case add:
- __ addsd(xmm0, at_rsp());
- __ addptr(rsp, 2 * Interpreter::stackElementSize);
- break;
- case sub:
- __ movdbl(xmm1, xmm0);
- __ pop_d(xmm0);
- __ subsd(xmm0, xmm1);
- break;
- case mul:
- __ mulsd(xmm0, at_rsp());
- __ addptr(rsp, 2 * Interpreter::stackElementSize);
- break;
- case div:
- __ movdbl(xmm1, xmm0);
- __ pop_d(xmm0);
- __ divsd(xmm0, xmm1);
- break;
- case rem:
- // Similar to fop2(), the modulo operation is performed using the
- // SharedRuntime::drem method on x86_64 platforms for the same reasons
- // as mentioned in fop2().
- __ movdbl(xmm1, xmm0);
- __ pop_d(xmm0);
- __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::drem), 2);
- break;
- default:
- ShouldNotReachHere();
- break;
- }
- } else {
+ switch (op) {
+ case add:
+ __ addsd(xmm0, at_rsp());
+ __ addptr(rsp, 2 * Interpreter::stackElementSize);
+ break;
+ case sub:
+ __ movdbl(xmm1, xmm0);
+ __ pop_d(xmm0);
+ __ subsd(xmm0, xmm1);
+ break;
+ case mul:
+ __ mulsd(xmm0, at_rsp());
+ __ addptr(rsp, 2 * Interpreter::stackElementSize);
+ break;
+ case div:
+ __ movdbl(xmm1, xmm0);
+ __ pop_d(xmm0);
+ __ divsd(xmm0, xmm1);
+ break;
+ case rem:
+ // Similar to fop2(), the modulo operation is performed using the
+ // SharedRuntime::drem method on x86_64 platforms for the same reasons
+ // as mentioned in fop2().
+ __ movdbl(xmm1, xmm0);
+ __ pop_d(xmm0);
+ __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::drem), 2);
+ break;
+ default:
ShouldNotReachHere();
+ break;
}
}
@@ -1502,23 +1486,15 @@ static jlong double_signflip_pool[2*2];
void TemplateTable::fneg() {
transition(ftos, ftos);
- if (UseSSE >= 1) {
- static jlong *float_signflip = double_quadword(&float_signflip_pool[1], CONST64(0x8000000080000000), CONST64(0x8000000080000000));
- __ xorps(xmm0, ExternalAddress((address) float_signflip), rscratch1);
- } else {
- ShouldNotReachHere();
- }
+ static jlong *float_signflip = double_quadword(&float_signflip_pool[1], CONST64(0x8000000080000000), CONST64(0x8000000080000000));
+ __ xorps(xmm0, ExternalAddress((address) float_signflip), rscratch1);
}
void TemplateTable::dneg() {
transition(dtos, dtos);
- if (UseSSE >= 2) {
- static jlong *double_signflip =
- double_quadword(&double_signflip_pool[1], CONST64(0x8000000000000000), CONST64(0x8000000000000000));
- __ xorpd(xmm0, ExternalAddress((address) double_signflip), rscratch1);
- } else {
- ShouldNotReachHere();
- }
+ static jlong *double_signflip =
+ double_quadword(&double_signflip_pool[1], CONST64(0x8000000000000000), CONST64(0x8000000000000000));
+ __ xorpd(xmm0, ExternalAddress((address) double_signflip), rscratch1);
}
void TemplateTable::iinc() {
@@ -1682,36 +1658,31 @@ void TemplateTable::lcmp() {
}
void TemplateTable::float_cmp(bool is_float, int unordered_result) {
- if ((is_float && UseSSE >= 1) ||
- (!is_float && UseSSE >= 2)) {
- Label done;
- if (is_float) {
- // XXX get rid of pop here, use ... reg, mem32
- __ pop_f(xmm1);
- __ ucomiss(xmm1, xmm0);
- } else {
- // XXX get rid of pop here, use ... reg, mem64
- __ pop_d(xmm1);
- __ ucomisd(xmm1, xmm0);
- }
- if (unordered_result < 0) {
- __ movl(rax, -1);
- __ jccb(Assembler::parity, done);
- __ jccb(Assembler::below, done);
- __ setb(Assembler::notEqual, rdx);
- __ movzbl(rax, rdx);
- } else {
- __ movl(rax, 1);
- __ jccb(Assembler::parity, done);
- __ jccb(Assembler::above, done);
- __ movl(rax, 0);
- __ jccb(Assembler::equal, done);
- __ decrementl(rax);
- }
- __ bind(done);
+ Label done;
+ if (is_float) {
+ // XXX get rid of pop here, use ... reg, mem32
+ __ pop_f(xmm1);
+ __ ucomiss(xmm1, xmm0);
} else {
- ShouldNotReachHere();
+ // XXX get rid of pop here, use ... reg, mem64
+ __ pop_d(xmm1);
+ __ ucomisd(xmm1, xmm0);
+ }
+ if (unordered_result < 0) {
+ __ movl(rax, -1);
+ __ jccb(Assembler::parity, done);
+ __ jccb(Assembler::below, done);
+ __ setb(Assembler::notEqual, rdx);
+ __ movzbl(rax, rdx);
+ } else {
+ __ movl(rax, 1);
+ __ jccb(Assembler::parity, done);
+ __ jccb(Assembler::above, done);
+ __ movl(rax, 0);
+ __ jccb(Assembler::equal, done);
+ __ decrementl(rax);
}
+ __ bind(done);
}
void TemplateTable::branch(bool is_jsr, bool is_wide) {
@@ -2263,12 +2234,10 @@ void TemplateTable::resolve_cache_and_index_for_method(int byte_no,
if (VM_Version::supports_fast_class_init_checks() && bytecode() == Bytecodes::_invokestatic) {
const Register method = temp;
const Register klass = temp;
- const Register thread = r15_thread;
- assert(thread != noreg, "x86_32 not supported");
__ movptr(method, Address(cache, in_bytes(ResolvedMethodEntry::method_offset())));
__ load_method_holder(klass, method);
- __ clinit_barrier(klass, thread, nullptr /*L_fast_path*/, &L_clinit_barrier_slow);
+ __ clinit_barrier(klass, nullptr /*L_fast_path*/, &L_clinit_barrier_slow);
}
}
@@ -2568,7 +2537,7 @@ void TemplateTable::getfield_or_static(int byte_no, bool is_static, RewriteContr
__ jcc(Assembler::notZero, notByte);
// btos
- __ access_load_at(T_BYTE, IN_HEAP, rax, field, noreg, noreg);
+ __ access_load_at(T_BYTE, IN_HEAP, rax, field, noreg);
__ push(btos);
// Rewrite bytecode to be faster
if (!is_static && rc == may_rewrite) {
@@ -2581,7 +2550,7 @@ void TemplateTable::getfield_or_static(int byte_no, bool is_static, RewriteContr
__ jcc(Assembler::notEqual, notBool);
// ztos (same code as btos)
- __ access_load_at(T_BOOLEAN, IN_HEAP, rax, field, noreg, noreg);
+ __ access_load_at(T_BOOLEAN, IN_HEAP, rax, field, noreg);
__ push(ztos);
// Rewrite bytecode to be faster
if (!is_static && rc == may_rewrite) {
@@ -2605,7 +2574,7 @@ void TemplateTable::getfield_or_static(int byte_no, bool is_static, RewriteContr
__ cmpl(tos_state, itos);
__ jcc(Assembler::notEqual, notInt);
// itos
- __ access_load_at(T_INT, IN_HEAP, rax, field, noreg, noreg);
+ __ access_load_at(T_INT, IN_HEAP, rax, field, noreg);
__ push(itos);
// Rewrite bytecode to be faster
if (!is_static && rc == may_rewrite) {
@@ -2617,7 +2586,7 @@ void TemplateTable::getfield_or_static(int byte_no, bool is_static, RewriteContr
__ cmpl(tos_state, ctos);
__ jcc(Assembler::notEqual, notChar);
// ctos
- __ access_load_at(T_CHAR, IN_HEAP, rax, field, noreg, noreg);
+ __ access_load_at(T_CHAR, IN_HEAP, rax, field, noreg);
__ push(ctos);
// Rewrite bytecode to be faster
if (!is_static && rc == may_rewrite) {
@@ -2629,7 +2598,7 @@ void TemplateTable::getfield_or_static(int byte_no, bool is_static, RewriteContr
__ cmpl(tos_state, stos);
__ jcc(Assembler::notEqual, notShort);
// stos
- __ access_load_at(T_SHORT, IN_HEAP, rax, field, noreg, noreg);
+ __ access_load_at(T_SHORT, IN_HEAP, rax, field, noreg);
__ push(stos);
// Rewrite bytecode to be faster
if (!is_static && rc == may_rewrite) {
@@ -2643,7 +2612,7 @@ void TemplateTable::getfield_or_static(int byte_no, bool is_static, RewriteContr
// ltos
// Generate code as if volatile (x86_32). There just aren't enough registers to
// save that information and this code is faster than the test.
- __ access_load_at(T_LONG, IN_HEAP | MO_RELAXED, noreg /* ltos */, field, noreg, noreg);
+ __ access_load_at(T_LONG, IN_HEAP | MO_RELAXED, noreg /* ltos */, field, noreg);
__ push(ltos);
// Rewrite bytecode to be faster
if (!is_static && rc == may_rewrite) patch_bytecode(Bytecodes::_fast_lgetfield, bc, rbx);
@@ -2654,7 +2623,7 @@ void TemplateTable::getfield_or_static(int byte_no, bool is_static, RewriteContr
__ jcc(Assembler::notEqual, notFloat);
// ftos
- __ access_load_at(T_FLOAT, IN_HEAP, noreg /* ftos */, field, noreg, noreg);
+ __ access_load_at(T_FLOAT, IN_HEAP, noreg /* ftos */, field, noreg);
__ push(ftos);
// Rewrite bytecode to be faster
if (!is_static && rc == may_rewrite) {
@@ -2670,7 +2639,7 @@ void TemplateTable::getfield_or_static(int byte_no, bool is_static, RewriteContr
#endif
// dtos
// MO_RELAXED: for the case of volatile field, in fact it adds no extra work for the underlying implementation
- __ access_load_at(T_DOUBLE, IN_HEAP | MO_RELAXED, noreg /* dtos */, field, noreg, noreg);
+ __ access_load_at(T_DOUBLE, IN_HEAP | MO_RELAXED, noreg /* dtos */, field, noreg);
__ push(dtos);
// Rewrite bytecode to be faster
if (!is_static && rc == may_rewrite) {
@@ -3133,25 +3102,25 @@ void TemplateTable::fast_accessfield(TosState state) {
__ verify_oop(rax);
break;
case Bytecodes::_fast_lgetfield:
- __ access_load_at(T_LONG, IN_HEAP, noreg /* ltos */, field, noreg, noreg);
+ __ access_load_at(T_LONG, IN_HEAP, noreg /* ltos */, field, noreg);
break;
case Bytecodes::_fast_igetfield:
- __ access_load_at(T_INT, IN_HEAP, rax, field, noreg, noreg);
+ __ access_load_at(T_INT, IN_HEAP, rax, field, noreg);
break;
case Bytecodes::_fast_bgetfield:
- __ access_load_at(T_BYTE, IN_HEAP, rax, field, noreg, noreg);
+ __ access_load_at(T_BYTE, IN_HEAP, rax, field, noreg);
break;
case Bytecodes::_fast_sgetfield:
- __ access_load_at(T_SHORT, IN_HEAP, rax, field, noreg, noreg);
+ __ access_load_at(T_SHORT, IN_HEAP, rax, field, noreg);
break;
case Bytecodes::_fast_cgetfield:
- __ access_load_at(T_CHAR, IN_HEAP, rax, field, noreg, noreg);
+ __ access_load_at(T_CHAR, IN_HEAP, rax, field, noreg);
break;
case Bytecodes::_fast_fgetfield:
- __ access_load_at(T_FLOAT, IN_HEAP, noreg /* ftos */, field, noreg, noreg);
+ __ access_load_at(T_FLOAT, IN_HEAP, noreg /* ftos */, field, noreg);
break;
case Bytecodes::_fast_dgetfield:
- __ access_load_at(T_DOUBLE, IN_HEAP, noreg /* dtos */, field, noreg, noreg);
+ __ access_load_at(T_DOUBLE, IN_HEAP, noreg /* dtos */, field, noreg);
break;
default:
ShouldNotReachHere();
@@ -3180,14 +3149,14 @@ void TemplateTable::fast_xaccess(TosState state) {
const Address field = Address(rax, rbx, Address::times_1, 0*wordSize);
switch (state) {
case itos:
- __ access_load_at(T_INT, IN_HEAP, rax, field, noreg, noreg);
+ __ access_load_at(T_INT, IN_HEAP, rax, field, noreg);
break;
case atos:
do_oop_load(_masm, field, rax);
__ verify_oop(rax);
break;
case ftos:
- __ access_load_at(T_FLOAT, IN_HEAP, noreg /* ftos */, field, noreg, noreg);
+ __ access_load_at(T_FLOAT, IN_HEAP, noreg /* ftos */, field, noreg);
break;
default:
ShouldNotReachHere();
@@ -3572,7 +3541,7 @@ void TemplateTable::_new() {
// make sure klass is initialized
// init_state needs acquire, but x86 is TSO, and so we are already good.
assert(VM_Version::supports_fast_class_init_checks(), "must support fast class initialization checks");
- __ clinit_barrier(rcx, r15_thread, nullptr /*L_fast_path*/, &slow_case);
+ __ clinit_barrier(rcx, nullptr /*L_fast_path*/, &slow_case);
// get instance_size in InstanceKlass (scaled to a count of bytes)
__ movl(rdx, Address(rcx, Klass::layout_helper_offset()));
@@ -3590,7 +3559,7 @@ void TemplateTable::_new() {
// Go to slow path.
if (UseTLAB) {
- __ tlab_allocate(r15_thread, rax, rdx, 0, rcx, rbx, slow_case);
+ __ tlab_allocate(rax, rdx, 0, rcx, rbx, slow_case);
if (ZeroTLAB) {
// the fields have been already cleared
__ jmp(initialize_header);
@@ -3711,8 +3680,7 @@ void TemplateTable::checkcast() {
__ push(atos); // save receiver for result, and for GC
call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
- // vm_result_2 has metadata result
- __ get_vm_result_2(rax, r15_thread);
+ __ get_vm_result_metadata(rax);
__ pop_ptr(rdx); // restore receiver
__ jmpb(resolved);
@@ -3767,9 +3735,8 @@ void TemplateTable::instanceof() {
__ push(atos); // save receiver for result, and for GC
call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc));
- // vm_result_2 has metadata result
- __ get_vm_result_2(rax, r15_thread);
+ __ get_vm_result_metadata(rax);
__ pop_ptr(rdx); // restore receiver
__ verify_oop(rdx);
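
For context (an assumption shown for illustration, not part of the patch): the templateTable changes above replace `load_float`/`load_double` and `store_float`/`store_double` with direct `movflt`/`movdbl` on `xmm0`, and drop the `UseSSE` guards in `fconst`, `dconst`, `fop2`, `dop2`, `fneg`, `dneg` and `float_cmp`, because SSE2 is the x86_64 baseline and the x87 fallback those helpers dispatched to is dead code. Roughly, the removed helper had this shape (stand-in types, not HotSpot's):

```cpp
// Assumed shape of the removed dispatch helper; only the SSE branch can be taken
// on x86_64, which is why the diff emits movflt(xmm0, src) directly at call sites.
struct Address {};
static int UseSSE = 2;                   // always >= 2 on x86_64
static void movflt_xmm0(const Address&) { /* SSE scalar load into xmm0 (ftos) */ }
static void fld_s(const Address&)       { /* legacy x87 load, 32-bit only */ }

static void load_float(const Address& src) {
  if (UseSSE >= 1) {
    movflt_xmm0(src);
  } else {
    fld_s(src);
  }
}
```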
diff --git a/src/hotspot/cpu/x86/vmStructs_x86.hpp b/src/hotspot/cpu/x86/vmStructs_x86.hpp
index d894d8b09a7f2..b8089a6413e46 100644
--- a/src/hotspot/cpu/x86/vmStructs_x86.hpp
+++ b/src/hotspot/cpu/x86/vmStructs_x86.hpp
@@ -29,15 +29,20 @@
// constants required by the Serviceability Agent. This file is
// referenced by vmStructs.cpp.
-#define VM_STRUCTS_CPU(nonstatic_field, static_field, unchecked_nonstatic_field, volatile_nonstatic_field, nonproduct_nonstatic_field) \
- volatile_nonstatic_field(JavaFrameAnchor, _last_Java_fp, intptr_t*)
+#define VM_STRUCTS_CPU(nonstatic_field, static_field, unchecked_nonstatic_field, volatile_nonstatic_field, nonproduct_nonstatic_field) \
+ volatile_nonstatic_field(JavaFrameAnchor, _last_Java_fp, intptr_t*) \
+ static_field(VM_Version, _features, VM_Version::VM_Features) \
+ nonstatic_field(VM_Version::VM_Features, _features_bitmap[0], uint64_t) \
+ static_field(VM_Version::VM_Features, _features_bitmap_size, int)
#define VM_TYPES_CPU(declare_type, declare_toplevel_type, declare_oop_type, declare_integer_type, declare_unsigned_integer_type) \
+ declare_toplevel_type(VM_Version::VM_Features)
#define VM_INT_CONSTANTS_CPU(declare_constant, declare_preprocessor_constant) \
- LP64_ONLY(declare_constant(frame::arg_reg_save_area_bytes)) \
- declare_constant(frame::interpreter_frame_sender_sp_offset) \
- declare_constant(frame::interpreter_frame_last_sp_offset)
+ declare_constant(frame::arg_reg_save_area_bytes) \
+ declare_constant(frame::interpreter_frame_sender_sp_offset) \
+ declare_constant(frame::interpreter_frame_last_sp_offset) \
+ declare_constant(frame::entry_frame_call_wrapper_offset)
#define VM_LONG_CONSTANTS_CPU(declare_constant, declare_preprocessor_constant)
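
For illustration (not part of the patch): the new SA entries above, and the `set_feature`/`clear_feature`/`supports_feature` calls throughout vm_version_x86.cpp below, replace the old single `uint64_t _features` word with a `VM_Version::VM_Features` value backed by a `_features_bitmap` array, so the feature list can grow past 64 entries (e.g. the new `CPU_AVX10_1`/`CPU_AVX10_2` bits). A self-contained sketch of such a multi-word bitset; member names mirror the diff, the exact layout and capacity are assumptions:

```cpp
#include <cstdint>

// Sketch of a multi-word CPU feature set; names follow the diff, layout is assumed.
class VM_Features {
  static const int MAX_FEATURES = 128;                 // assumed capacity
  uint64_t _features_bitmap[MAX_FEATURES / 64] = {};   // exported to the SA as an array

 public:
  void set_feature(int f)            { _features_bitmap[f / 64] |=  (uint64_t(1) << (f % 64)); }
  void clear_feature(int f)          { _features_bitmap[f / 64] &= ~(uint64_t(1) << (f % 64)); }
  bool supports_feature(int f) const { return (_features_bitmap[f / 64] >> (f % 64)) & 1; }
};
```

With this shape, each old `_features &= ~CPU_X` rewrites mechanically to `_features.clear_feature(CPU_X)`, which accounts for the bulk of the vm_version_x86.cpp hunks that follow.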
diff --git a/src/hotspot/cpu/x86/vm_version_x86.cpp b/src/hotspot/cpu/x86/vm_version_x86.cpp
index 4b9c1c3416a5d..fe59a1534133c 100644
--- a/src/hotspot/cpu/x86/vm_version_x86.cpp
+++ b/src/hotspot/cpu/x86/vm_version_x86.cpp
@@ -63,6 +63,11 @@ address VM_Version::_cpuinfo_cont_addr_apx = nullptr;
static BufferBlob* stub_blob;
static const int stub_size = 2000;
+int VM_Version::VM_Features::_features_bitmap_size = sizeof(VM_Version::VM_Features::_features_bitmap) / BytesPerLong;
+
+VM_Version::VM_Features VM_Version::_features;
+VM_Version::VM_Features VM_Version::_cpu_features;
+
extern "C" {
typedef void (*get_cpu_info_stub_t)(void*);
typedef void (*detect_virt_stub_t)(uint32_t, uint32_t*);
@@ -72,8 +77,6 @@ static get_cpu_info_stub_t get_cpu_info_stub = nullptr;
static detect_virt_stub_t detect_virt_stub = nullptr;
static clear_apx_test_state_t clear_apx_test_state_stub = nullptr;
-#ifdef _LP64
-
bool VM_Version::supports_clflush() {
// clflush should always be available on x86_64
// if not we are in real trouble because we rely on it
@@ -84,10 +87,9 @@ bool VM_Version::supports_clflush() {
// up. Assembler::flush calls this routine to check that clflush
// is allowed. So, we give the caller a free pass if Universe init
// is still in progress.
- assert ((!Universe::is_fully_initialized() || (_features & CPU_FLUSH) != 0), "clflush should be available");
+ assert ((!Universe::is_fully_initialized() || _features.supports_feature(CPU_FLUSH)), "clflush should be available");
return true;
}
-#endif
#define CPUID_STANDARD_FN 0x0
#define CPUID_STANDARD_FN_1 0x1
@@ -107,7 +109,6 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
VM_Version_StubGenerator(CodeBuffer *c) : StubCodeGenerator(c) {}
-#if defined(_LP64)
address clear_apx_test_state() {
# define __ _masm->
address start = __ pc();
@@ -126,7 +127,6 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
__ ret(0);
return start;
}
-#endif
address generate_get_cpu_info() {
// Flags to test CPU type.
@@ -138,7 +138,7 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
const uint32_t CPU_FAMILY_486 = (4 << CPU_FAMILY_SHIFT);
bool use_evex = FLAG_IS_DEFAULT(UseAVX) || (UseAVX > 2);
- Label detect_486, cpu486, detect_586, std_cpuid1, std_cpuid4;
+ Label detect_486, cpu486, detect_586, std_cpuid1, std_cpuid4, std_cpuid24;
Label sef_cpuid, sefsl1_cpuid, ext_cpuid, ext_cpuid1, ext_cpuid5, ext_cpuid7;
Label ext_cpuid8, done, wrapup, vector_save_restore, apx_save_restore_warning;
Label legacy_setup, save_restore_except, legacy_save_restore, start_simd_check;
@@ -151,14 +151,10 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
//
// void get_cpu_info(VM_Version::CpuidInfo* cpuid_info);
//
- // LP64: rcx and rdx are first and second argument registers on windows
+ // rcx and rdx are first and second argument registers on windows
__ push(rbp);
-#ifdef _LP64
__ mov(rbp, c_rarg0); // cpuid_info address
-#else
- __ movptr(rbp, Address(rsp, 8)); // cpuid_info address
-#endif
__ push(rbx);
__ push(rsi);
__ pushf(); // preserve rbx, and flags
@@ -341,6 +337,17 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
__ movl(Address(rsi, 0), rax);
__ movl(Address(rsi, 4), rdx);
+ //
+ // cpuid(0x24) Converged Vector ISA Main Leaf (EAX = 24H, ECX = 0).
+ //
+ __ bind(std_cpuid24);
+ __ movl(rax, 0x24);
+ __ movl(rcx, 0);
+ __ cpuid();
+ __ lea(rsi, Address(rbp, in_bytes(VM_Version::std_cpuid24_offset())));
+ __ movl(Address(rsi, 0), rax);
+ __ movl(Address(rsi, 4), rbx);
+
//
// Extended cpuid(0x80000000)
//
@@ -418,7 +425,6 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
__ movl(Address(rsi, 8), rcx);
__ movl(Address(rsi,12), rdx);
-#if defined(_LP64)
//
// Check if OS has enabled XGETBV instruction to access XCR0
// (OSXSAVE feature flag) and CPU supports APX
@@ -428,13 +434,11 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
__ lea(rsi, Address(rbp, in_bytes(VM_Version::sefsl1_cpuid7_offset())));
__ movl(rax, 0x200000);
__ andl(rax, Address(rsi, 4));
- __ cmpl(rax, 0x200000);
- __ jcc(Assembler::notEqual, vector_save_restore);
+ __ jcc(Assembler::equal, vector_save_restore);
// check _cpuid_info.xem_xcr0_eax.bits.apx_f
__ movl(rax, 0x80000);
__ andl(rax, Address(rbp, in_bytes(VM_Version::xem_xcr0_offset()))); // xcr0 bits apx_f
- __ cmpl(rax, 0x80000);
- __ jcc(Assembler::notEqual, vector_save_restore);
+ __ jcc(Assembler::equal, vector_save_restore);
#ifndef PRODUCT
bool save_apx = UseAPX;
@@ -453,7 +457,6 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
__ movq(Address(rsi, 8), r31);
UseAPX = save_apx;
-#endif
#endif
__ bind(vector_save_restore);
//
@@ -488,11 +491,15 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
// If UseAVX is uninitialized or is set by the user to include EVEX
if (use_evex) {
// check _cpuid_info.sef_cpuid7_ebx.bits.avx512f
+ // OR check _cpuid_info.sefsl1_cpuid7_edx.bits.avx10
__ lea(rsi, Address(rbp, in_bytes(VM_Version::sef_cpuid7_offset())));
__ movl(rax, 0x10000);
- __ andl(rax, Address(rsi, 4)); // xcr0 bits sse | ymm
- __ cmpl(rax, 0x10000);
- __ jccb(Assembler::notEqual, legacy_setup); // jump if EVEX is not supported
+ __ andl(rax, Address(rsi, 4));
+ __ lea(rsi, Address(rbp, in_bytes(VM_Version::sefsl1_cpuid7_offset())));
+ __ movl(rbx, 0x80000);
+ __ andl(rbx, Address(rsi, 4));
+ __ orl(rax, rbx);
+ __ jccb(Assembler::equal, legacy_setup); // jump if EVEX is not supported
// check _cpuid_info.xem_xcr0_eax.bits.opmask
// check _cpuid_info.xem_xcr0_eax.bits.zmm512
// check _cpuid_info.xem_xcr0_eax.bits.zmm32
@@ -527,10 +534,8 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
__ movdl(xmm0, rcx);
__ vpbroadcastd(xmm0, xmm0, Assembler::AVX_512bit);
__ evmovdqul(xmm7, xmm0, Assembler::AVX_512bit);
-#ifdef _LP64
__ evmovdqul(xmm8, xmm0, Assembler::AVX_512bit);
__ evmovdqul(xmm31, xmm0, Assembler::AVX_512bit);
-#endif
VM_Version::clean_cpuFeatures();
__ jmp(save_restore_except);
}
@@ -556,10 +561,8 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
__ pshufd(xmm0, xmm0, 0x00);
__ vinsertf128_high(xmm0, xmm0);
__ vmovdqu(xmm7, xmm0);
-#ifdef _LP64
__ vmovdqu(xmm8, xmm0);
__ vmovdqu(xmm15, xmm0);
-#endif
VM_Version::clean_cpuFeatures();
__ bind(save_restore_except);
@@ -577,8 +580,7 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
__ lea(rsi, Address(rbp, in_bytes(VM_Version::sef_cpuid7_offset())));
__ movl(rax, 0x10000);
__ andl(rax, Address(rsi, 4));
- __ cmpl(rax, 0x10000);
- __ jcc(Assembler::notEqual, legacy_save_restore);
+ __ jcc(Assembler::equal, legacy_save_restore);
// check _cpuid_info.xem_xcr0_eax.bits.opmask
// check _cpuid_info.xem_xcr0_eax.bits.zmm512
// check _cpuid_info.xem_xcr0_eax.bits.zmm32
@@ -600,10 +602,8 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
__ lea(rsi, Address(rbp, in_bytes(VM_Version::zmm_save_offset())));
__ evmovdqul(Address(rsi, 0), xmm0, Assembler::AVX_512bit);
__ evmovdqul(Address(rsi, 64), xmm7, Assembler::AVX_512bit);
-#ifdef _LP64
__ evmovdqul(Address(rsi, 128), xmm8, Assembler::AVX_512bit);
__ evmovdqul(Address(rsi, 192), xmm31, Assembler::AVX_512bit);
-#endif
#ifdef _WINDOWS
__ evmovdqul(xmm31, Address(rsp, 0), Assembler::AVX_512bit);
@@ -628,10 +628,8 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
__ lea(rsi, Address(rbp, in_bytes(VM_Version::ymm_save_offset())));
__ vmovdqu(Address(rsi, 0), xmm0);
__ vmovdqu(Address(rsi, 32), xmm7);
-#ifdef _LP64
__ vmovdqu(Address(rsi, 64), xmm8);
__ vmovdqu(Address(rsi, 96), xmm15);
-#endif
#ifdef _WINDOWS
__ vmovdqu(xmm15, Address(rsp, 0));
@@ -687,13 +685,8 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
__ push(rbx);
__ push(rsi); // for Windows
-#ifdef _LP64
__ mov(rax, c_rarg0); // CPUID leaf
__ mov(rsi, c_rarg1); // register array address (eax, ebx, ecx, edx)
-#else
- __ movptr(rax, Address(rsp, 16)); // CPUID leaf
- __ movptr(rsi, Address(rsp, 20)); // register array address
-#endif
__ cpuid();
@@ -734,14 +727,10 @@ class VM_Version_StubGenerator: public StubCodeGenerator {
//
// void getCPUIDBrandString(VM_Version::CpuidInfo* cpuid_info);
//
- // LP64: rcx and rdx are first and second argument registers on windows
+ // rcx and rdx are first and second argument registers on windows
__ push(rbp);
-#ifdef _LP64
__ mov(rbp, c_rarg0); // cpuid_info address
-#else
- __ movptr(rbp, Address(rsp, 8)); // cpuid_info address
-#endif
__ push(rbx);
__ push(rsi);
__ pushf(); // preserve rbx, and flags
@@ -863,7 +852,6 @@ void VM_Version::get_processor_features() {
_cpu = 4; // 486 by default
_model = 0;
_stepping = 0;
- _features = 0;
_logical_processors_per_package = 1;
// i486 internal cache is both I&D and has a 16-byte line size
_L1_data_cache_line_size = 16;
@@ -879,7 +867,7 @@ void VM_Version::get_processor_features() {
if (cpu_family() > 4) { // it supports CPUID
_features = _cpuid_info.feature_flags(); // These can be changed by VM settings
- _cpu_features = _features; // Preserve features
+ _cpu_features = _features; // Preserve features
// Logical processors are only available on P4s and above,
// and only if hyperthreading is available.
_logical_processors_per_package = logical_processor_count();
@@ -889,19 +877,16 @@ void VM_Version::get_processor_features() {
// xchg and xadd instructions
_supports_atomic_getset4 = true;
_supports_atomic_getadd4 = true;
- LP64_ONLY(_supports_atomic_getset8 = true);
- LP64_ONLY(_supports_atomic_getadd8 = true);
+ _supports_atomic_getset8 = true;
+ _supports_atomic_getadd8 = true;
-#ifdef _LP64
// OS should support SSE for x64 and hardware should support at least SSE2.
if (!VM_Version::supports_sse2()) {
vm_exit_during_initialization("Unknown x64 processor: SSE2 not supported");
}
// in 64 bit the use of SSE2 is the minimum
if (UseSSE < 2) UseSSE = 2;
-#endif
-#ifdef AMD64
// flush_icache_stub have to be generated first.
// That is why Icache line size is hard coded in ICache class,
// see icache_x86.hpp. It is also the reason why we can't use
@@ -913,9 +898,7 @@ void VM_Version::get_processor_features() {
guarantee(_cpuid_info.std_cpuid1_edx.bits.clflush != 0, "clflush is not supported");
// clflush_size is size in quadwords (8 bytes).
guarantee(_cpuid_info.std_cpuid1_ebx.bits.clflush_size == 8, "such clflush size is not supported");
-#endif
-#ifdef _LP64
// assigning this field effectively enables Unsafe.writebackMemory()
// by initing UnsafeConstant.DATA_CACHE_LINE_FLUSH_SIZE to non-zero
// that is only implemented on x86_64 and only if the OS plays ball
@@ -924,7 +907,6 @@ void VM_Version::get_processor_features() {
// let if default to zero thereby disabling writeback
_data_cache_line_flush_size = _cpuid_info.std_cpuid1_ebx.bits.clflush_size * 8;
}
-#endif
// Check if processor has Intel Ecore
if (FLAG_IS_DEFAULT(EnableX86ECoreOpts) && is_intel() && cpu_family() == 6 &&
@@ -934,21 +916,21 @@ void VM_Version::get_processor_features() {
}
if (UseSSE < 4) {
- _features &= ~CPU_SSE4_1;
- _features &= ~CPU_SSE4_2;
+ _features.clear_feature(CPU_SSE4_1);
+ _features.clear_feature(CPU_SSE4_2);
}
if (UseSSE < 3) {
- _features &= ~CPU_SSE3;
- _features &= ~CPU_SSSE3;
- _features &= ~CPU_SSE4A;
+ _features.clear_feature(CPU_SSE3);
+ _features.clear_feature(CPU_SSSE3);
+ _features.clear_feature(CPU_SSE4A);
}
if (UseSSE < 2)
- _features &= ~CPU_SSE2;
+ _features.clear_feature(CPU_SSE2);
if (UseSSE < 1)
- _features &= ~CPU_SSE;
+ _features.clear_feature(CPU_SSE);
//since AVX instructions is slower than SSE in some ZX cpus, force USEAVX=0.
if (is_zx() && ((cpu_family() == 6) || (cpu_family() == 7))) {
@@ -1014,21 +996,25 @@ void VM_Version::get_processor_features() {
}
if (UseAVX < 3) {
- _features &= ~CPU_AVX512F;
- _features &= ~CPU_AVX512DQ;
- _features &= ~CPU_AVX512CD;
- _features &= ~CPU_AVX512BW;
- _features &= ~CPU_AVX512VL;
- _features &= ~CPU_AVX512_VPOPCNTDQ;
- _features &= ~CPU_AVX512_VPCLMULQDQ;
- _features &= ~CPU_AVX512_VAES;
- _features &= ~CPU_AVX512_VNNI;
- _features &= ~CPU_AVX512_VBMI;
- _features &= ~CPU_AVX512_VBMI2;
- _features &= ~CPU_AVX512_BITALG;
- _features &= ~CPU_AVX512_IFMA;
- _features &= ~CPU_APX_F;
- _features &= ~CPU_AVX512_FP16;
+ _features.clear_feature(CPU_AVX512F);
+ _features.clear_feature(CPU_AVX512DQ);
+ _features.clear_feature(CPU_AVX512CD);
+ _features.clear_feature(CPU_AVX512BW);
+ _features.clear_feature(CPU_AVX512ER);
+ _features.clear_feature(CPU_AVX512PF);
+ _features.clear_feature(CPU_AVX512VL);
+ _features.clear_feature(CPU_AVX512_VPOPCNTDQ);
+ _features.clear_feature(CPU_AVX512_VPCLMULQDQ);
+ _features.clear_feature(CPU_AVX512_VAES);
+ _features.clear_feature(CPU_AVX512_VNNI);
+ _features.clear_feature(CPU_AVX512_VBMI);
+ _features.clear_feature(CPU_AVX512_VBMI2);
+ _features.clear_feature(CPU_AVX512_BITALG);
+ _features.clear_feature(CPU_AVX512_IFMA);
+ _features.clear_feature(CPU_APX_F);
+ _features.clear_feature(CPU_AVX512_FP16);
+ _features.clear_feature(CPU_AVX10_1);
+ _features.clear_feature(CPU_AVX10_2);
}
// Currently APX support is only enabled for targets supporting AVX512VL feature.
@@ -1041,45 +1027,47 @@ void VM_Version::get_processor_features() {
}
if (!UseAPX) {
- _features &= ~CPU_APX_F;
+ _features.clear_feature(CPU_APX_F);
}
if (UseAVX < 2) {
- _features &= ~CPU_AVX2;
- _features &= ~CPU_AVX_IFMA;
+ _features.clear_feature(CPU_AVX2);
+ _features.clear_feature(CPU_AVX_IFMA);
}
if (UseAVX < 1) {
- _features &= ~CPU_AVX;
- _features &= ~CPU_VZEROUPPER;
- _features &= ~CPU_F16C;
- _features &= ~CPU_SHA512;
+ _features.clear_feature(CPU_AVX);
+ _features.clear_feature(CPU_VZEROUPPER);
+ _features.clear_feature(CPU_F16C);
+ _features.clear_feature(CPU_SHA512);
}
if (logical_processors_per_package() == 1) {
// HT processor could be installed on a system which doesn't support HT.
- _features &= ~CPU_HT;
+ _features.clear_feature(CPU_HT);
}
if (is_intel()) { // Intel cpus specific settings
if (is_knights_family()) {
- _features &= ~CPU_VZEROUPPER;
- _features &= ~CPU_AVX512BW;
- _features &= ~CPU_AVX512VL;
- _features &= ~CPU_AVX512DQ;
- _features &= ~CPU_AVX512_VNNI;
- _features &= ~CPU_AVX512_VAES;
- _features &= ~CPU_AVX512_VPOPCNTDQ;
- _features &= ~CPU_AVX512_VPCLMULQDQ;
- _features &= ~CPU_AVX512_VBMI;
- _features &= ~CPU_AVX512_VBMI2;
- _features &= ~CPU_CLWB;
- _features &= ~CPU_FLUSHOPT;
- _features &= ~CPU_GFNI;
- _features &= ~CPU_AVX512_BITALG;
- _features &= ~CPU_AVX512_IFMA;
- _features &= ~CPU_AVX_IFMA;
- _features &= ~CPU_AVX512_FP16;
+ _features.clear_feature(CPU_VZEROUPPER);
+ _features.clear_feature(CPU_AVX512BW);
+ _features.clear_feature(CPU_AVX512VL);
+ _features.clear_feature(CPU_AVX512DQ);
+ _features.clear_feature(CPU_AVX512_VNNI);
+ _features.clear_feature(CPU_AVX512_VAES);
+ _features.clear_feature(CPU_AVX512_VPOPCNTDQ);
+ _features.clear_feature(CPU_AVX512_VPCLMULQDQ);
+ _features.clear_feature(CPU_AVX512_VBMI);
+ _features.clear_feature(CPU_AVX512_VBMI2);
+ _features.clear_feature(CPU_CLWB);
+ _features.clear_feature(CPU_FLUSHOPT);
+ _features.clear_feature(CPU_GFNI);
+ _features.clear_feature(CPU_AVX512_BITALG);
+ _features.clear_feature(CPU_AVX512_IFMA);
+ _features.clear_feature(CPU_AVX_IFMA);
+ _features.clear_feature(CPU_AVX512_FP16);
+ _features.clear_feature(CPU_AVX10_1);
+ _features.clear_feature(CPU_AVX10_2);
}
}
@@ -1089,16 +1077,44 @@ void VM_Version::get_processor_features() {
_has_intel_jcc_erratum = IntelJccErratumMitigation;
}
- char buf[1024];
- int res = jio_snprintf(
+ assert(supports_clflush(), "Always present");
+ if (X86ICacheSync == -1) {
+ // Auto-detect, choosing the best performant one that still flushes
+ // the cache. We could switch to CPUID/SERIALIZE ("4"/"5") going forward.
+ if (supports_clwb()) {
+ FLAG_SET_ERGO(X86ICacheSync, 3);
+ } else if (supports_clflushopt()) {
+ FLAG_SET_ERGO(X86ICacheSync, 2);
+ } else {
+ FLAG_SET_ERGO(X86ICacheSync, 1);
+ }
+ } else {
+ if ((X86ICacheSync == 2) && !supports_clflushopt()) {
+ vm_exit_during_initialization("CPU does not support CLFLUSHOPT, unable to use X86ICacheSync=2");
+ }
+ if ((X86ICacheSync == 3) && !supports_clwb()) {
+ vm_exit_during_initialization("CPU does not support CLWB, unable to use X86ICacheSync=3");
+ }
+ if ((X86ICacheSync == 5) && !supports_serialize()) {
+ vm_exit_during_initialization("CPU does not support SERIALIZE, unable to use X86ICacheSync=5");
+ }
+ }
+
+ char buf[2048];
+ size_t cpu_info_size = jio_snprintf(
buf, sizeof(buf),
"(%u cores per cpu, %u threads per core) family %d model %d stepping %d microcode 0x%x",
cores_per_cpu(), threads_per_core(),
cpu_family(), _model, _stepping, os::cpu_microcode_revision());
- assert(res > 0, "not enough temporary space allocated");
- insert_features_names(buf + res, sizeof(buf) - res, _features_names);
+ assert(cpu_info_size > 0, "not enough temporary space allocated");
+
+ insert_features_names(_features, buf + cpu_info_size, sizeof(buf) - cpu_info_size);
+
+ _cpu_info_string = os::strdup(buf);
- _features_string = os::strdup(buf);
+ _features_string = extract_features_string(_cpu_info_string,
+ strnlen(_cpu_info_string, sizeof(buf)),
+ cpu_info_size);
// Use AES instructions if available.
if (supports_aes()) {
@@ -1182,7 +1198,6 @@ void VM_Version::get_processor_features() {
FLAG_SET_DEFAULT(UseCRC32Intrinsics, false);
}
-#ifdef _LP64
if (supports_avx2()) {
if (FLAG_IS_DEFAULT(UseAdler32Intrinsics)) {
UseAdler32Intrinsics = true;
@@ -1193,12 +1208,6 @@ void VM_Version::get_processor_features() {
}
FLAG_SET_DEFAULT(UseAdler32Intrinsics, false);
}
-#else
- if (UseAdler32Intrinsics) {
- warning("Adler32Intrinsics not available on this CPU.");
- FLAG_SET_DEFAULT(UseAdler32Intrinsics, false);
- }
-#endif
if (supports_sse4_2() && supports_clmul()) {
if (FLAG_IS_DEFAULT(UseCRC32CIntrinsics)) {
@@ -1222,7 +1231,6 @@ void VM_Version::get_processor_features() {
FLAG_SET_DEFAULT(UseGHASHIntrinsics, false);
}
-#ifdef _LP64
// ChaCha20 Intrinsics
// As long as the system supports AVX as a baseline we can do a
// SIMD-enabled block function. StubGenerator makes the determination
@@ -1238,13 +1246,17 @@ void VM_Version::get_processor_features() {
}
FLAG_SET_DEFAULT(UseChaCha20Intrinsics, false);
}
-#else
- // No support currently for ChaCha20 intrinsics on 32-bit platforms
- if (UseChaCha20Intrinsics) {
- warning("ChaCha20 intrinsics are not available on this CPU.");
- FLAG_SET_DEFAULT(UseChaCha20Intrinsics, false);
+
+ // Dilithium Intrinsics
+ // Currently we only have them for AVX512
+ if (supports_evex() && supports_avx512bw()) {
+ if (FLAG_IS_DEFAULT(UseDilithiumIntrinsics)) {
+ UseDilithiumIntrinsics = true;
+ }
+ } else if (UseDilithiumIntrinsics) {
+ warning("Intrinsics for ML-DSA are not available on this CPU.");
+ FLAG_SET_DEFAULT(UseDilithiumIntrinsics, false);
}
-#endif // _LP64
// Base64 Intrinsics (Check the condition for which the intrinsic will be active)
if (UseAVX >= 2) {
@@ -1257,7 +1269,7 @@ void VM_Version::get_processor_features() {
FLAG_SET_DEFAULT(UseBASE64Intrinsics, false);
}
- if (supports_fma() && UseSSE >= 2) { // Check UseSSE since FMA code uses SSE instructions
+ if (supports_fma()) {
if (FLAG_IS_DEFAULT(UseFMA)) {
UseFMA = true;
}
@@ -1270,7 +1282,7 @@ void VM_Version::get_processor_features() {
UseMD5Intrinsics = true;
}
- if (supports_sha() LP64_ONLY(|| (supports_avx2() && supports_bmi2()))) {
+ if (supports_sha() || (supports_avx2() && supports_bmi2())) {
if (FLAG_IS_DEFAULT(UseSHA)) {
UseSHA = true;
}
@@ -1297,27 +1309,20 @@ void VM_Version::get_processor_features() {
FLAG_SET_DEFAULT(UseSHA256Intrinsics, false);
}
-#ifdef _LP64
- // These are only supported on 64-bit
if (UseSHA && supports_avx2() && (supports_bmi2() || supports_sha512())) {
if (FLAG_IS_DEFAULT(UseSHA512Intrinsics)) {
FLAG_SET_DEFAULT(UseSHA512Intrinsics, true);
}
- } else
-#endif
- if (UseSHA512Intrinsics) {
+ } else if (UseSHA512Intrinsics) {
warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU.");
FLAG_SET_DEFAULT(UseSHA512Intrinsics, false);
}
-#ifdef _LP64
if (supports_evex() && supports_avx512bw()) {
if (FLAG_IS_DEFAULT(UseSHA3Intrinsics)) {
UseSHA3Intrinsics = true;
}
- } else
-#endif
- if (UseSHA3Intrinsics) {
+ } else if (UseSHA3Intrinsics) {
warning("Intrinsics for SHA3-224, SHA3-256, SHA3-384 and SHA3-512 crypto hash functions not available on this CPU.");
FLAG_SET_DEFAULT(UseSHA3Intrinsics, false);
}
@@ -1326,22 +1331,9 @@ void VM_Version::get_processor_features() {
FLAG_SET_DEFAULT(UseSHA, false);
}
-#ifdef COMPILER2
- if (UseFPUForSpilling) {
- if (UseSSE < 2) {
- // Only supported with SSE2+
- FLAG_SET_DEFAULT(UseFPUForSpilling, false);
- }
- }
-#endif
-
#if COMPILER2_OR_JVMCI
int max_vector_size = 0;
- if (UseSSE < 2) {
- // Vectors (in XMM) are only supported with SSE2+
- // SSE is always 2 on x64.
- max_vector_size = 0;
- } else if (UseAVX == 0 || !os_supports_avx_vectors()) {
+ if (UseAVX == 0 || !os_supports_avx_vectors()) {
// 16 byte vectors (in XMM) are supported with SSE2+
max_vector_size = 16;
} else if (UseAVX == 1 || UseAVX == 2) {
@@ -1352,11 +1344,7 @@ void VM_Version::get_processor_features() {
max_vector_size = 64;
}
-#ifdef _LP64
int min_vector_size = 4; // We require MaxVectorSize to be at least 4 on 64bit
-#else
- int min_vector_size = 0;
-#endif
if (!FLAG_IS_DEFAULT(MaxVectorSize)) {
if (MaxVectorSize < min_vector_size) {
@@ -1380,7 +1368,7 @@ void VM_Version::get_processor_features() {
if (MaxVectorSize > 0) {
if (supports_avx() && PrintMiscellaneous && Verbose && TraceNewVectors) {
tty->print_cr("State of YMM registers after signal handle:");
- int nreg = 2 LP64_ONLY(+2);
+ int nreg = 4;
const char* ymm_name[4] = {"0", "7", "8", "15"};
for (int i = 0; i < nreg; i++) {
tty->print("YMM%s:", ymm_name[i]);
@@ -1393,31 +1381,24 @@ void VM_Version::get_processor_features() {
}
#endif // COMPILER2 && ASSERT
-#ifdef _LP64
if ((supports_avx512ifma() && supports_avx512vlbw()) || supports_avxifma()) {
if (FLAG_IS_DEFAULT(UsePoly1305Intrinsics)) {
FLAG_SET_DEFAULT(UsePoly1305Intrinsics, true);
}
- } else
-#endif
- if (UsePoly1305Intrinsics) {
+ } else if (UsePoly1305Intrinsics) {
warning("Intrinsics for Poly1305 crypto hash functions not available on this CPU.");
FLAG_SET_DEFAULT(UsePoly1305Intrinsics, false);
}
-#ifdef _LP64
if ((supports_avx512ifma() && supports_avx512vlbw()) || supports_avxifma()) {
if (FLAG_IS_DEFAULT(UseIntPolyIntrinsics)) {
FLAG_SET_DEFAULT(UseIntPolyIntrinsics, true);
}
- } else
-#endif
- if (UseIntPolyIntrinsics) {
+ } else if (UseIntPolyIntrinsics) {
warning("Intrinsics for Polynomial crypto functions not available on this CPU.");
FLAG_SET_DEFAULT(UseIntPolyIntrinsics, false);
}
-#ifdef _LP64
if (FLAG_IS_DEFAULT(UseMultiplyToLenIntrinsic)) {
UseMultiplyToLenIntrinsic = true;
}
@@ -1433,38 +1414,6 @@ void VM_Version::get_processor_features() {
if (FLAG_IS_DEFAULT(UseMontgomerySquareIntrinsic)) {
UseMontgomerySquareIntrinsic = true;
}
-#else
- if (UseMultiplyToLenIntrinsic) {
- if (!FLAG_IS_DEFAULT(UseMultiplyToLenIntrinsic)) {
- warning("multiplyToLen intrinsic is not available in 32-bit VM");
- }
- FLAG_SET_DEFAULT(UseMultiplyToLenIntrinsic, false);
- }
- if (UseMontgomeryMultiplyIntrinsic) {
- if (!FLAG_IS_DEFAULT(UseMontgomeryMultiplyIntrinsic)) {
- warning("montgomeryMultiply intrinsic is not available in 32-bit VM");
- }
- FLAG_SET_DEFAULT(UseMontgomeryMultiplyIntrinsic, false);
- }
- if (UseMontgomerySquareIntrinsic) {
- if (!FLAG_IS_DEFAULT(UseMontgomerySquareIntrinsic)) {
- warning("montgomerySquare intrinsic is not available in 32-bit VM");
- }
- FLAG_SET_DEFAULT(UseMontgomerySquareIntrinsic, false);
- }
- if (UseSquareToLenIntrinsic) {
- if (!FLAG_IS_DEFAULT(UseSquareToLenIntrinsic)) {
- warning("squareToLen intrinsic is not available in 32-bit VM");
- }
- FLAG_SET_DEFAULT(UseSquareToLenIntrinsic, false);
- }
- if (UseMulAddIntrinsic) {
- if (!FLAG_IS_DEFAULT(UseMulAddIntrinsic)) {
- warning("mulAdd intrinsic is not available in 32-bit VM");
- }
- FLAG_SET_DEFAULT(UseMulAddIntrinsic, false);
- }
-#endif // _LP64
#endif // COMPILER2_OR_JVMCI
// On new cpus instructions which update whole XMM register should be used
@@ -1741,7 +1690,6 @@ void VM_Version::get_processor_features() {
}
#endif
-#ifdef _LP64
if (UseSSE42Intrinsics) {
if (FLAG_IS_DEFAULT(UseVectorizedMismatchIntrinsic)) {
UseVectorizedMismatchIntrinsic = true;
@@ -1758,20 +1706,6 @@ void VM_Version::get_processor_features() {
warning("vectorizedHashCode intrinsics are not available on this CPU");
FLAG_SET_DEFAULT(UseVectorizedHashCodeIntrinsic, false);
}
-#else
- if (UseVectorizedMismatchIntrinsic) {
- if (!FLAG_IS_DEFAULT(UseVectorizedMismatchIntrinsic)) {
- warning("vectorizedMismatch intrinsic is not available in 32-bit VM");
- }
- FLAG_SET_DEFAULT(UseVectorizedMismatchIntrinsic, false);
- }
- if (UseVectorizedHashCodeIntrinsic) {
- if (!FLAG_IS_DEFAULT(UseVectorizedHashCodeIntrinsic)) {
- warning("vectorizedHashCode intrinsic is not available in 32-bit VM");
- }
- FLAG_SET_DEFAULT(UseVectorizedHashCodeIntrinsic, false);
- }
-#endif // _LP64
// Use count leading zeros count instruction if available.
if (supports_lzcnt()) {
@@ -1856,7 +1790,7 @@ void VM_Version::get_processor_features() {
#endif
// Use XMM/YMM MOVDQU instruction for Object Initialization
- if (!UseFastStosb && UseSSE >= 2 && UseUnalignedLoadStores) {
+ if (!UseFastStosb && UseUnalignedLoadStores) {
if (FLAG_IS_DEFAULT(UseXMMForObjInit)) {
UseXMMForObjInit = true;
}
@@ -1920,7 +1854,6 @@ void VM_Version::get_processor_features() {
#endif
}
-#ifdef _LP64
// Prefetch settings
// Prefetch interval for gc copy/scan == 9 dcache lines. Derived from
@@ -1939,7 +1872,6 @@ void VM_Version::get_processor_features() {
if (FLAG_IS_DEFAULT(PrefetchScanIntervalInBytes)) {
FLAG_SET_DEFAULT(PrefetchScanIntervalInBytes, 576);
}
-#endif
if (FLAG_IS_DEFAULT(ContendedPaddingWidth) &&
(cache_line_size > ContendedPaddingWidth))
@@ -1971,22 +1903,18 @@ void VM_Version::get_processor_features() {
#endif
log->cr();
log->print("Allocation");
- if (AllocatePrefetchStyle <= 0 || (UseSSE == 0 && !supports_3dnow_prefetch())) {
+ if (AllocatePrefetchStyle <= 0) {
log->print_cr(": no prefetching");
} else {
log->print(" prefetching: ");
- if (UseSSE == 0 && supports_3dnow_prefetch()) {
+ if (AllocatePrefetchInstr == 0) {
+ log->print("PREFETCHNTA");
+ } else if (AllocatePrefetchInstr == 1) {
+ log->print("PREFETCHT0");
+ } else if (AllocatePrefetchInstr == 2) {
+ log->print("PREFETCHT2");
+ } else if (AllocatePrefetchInstr == 3) {
log->print("PREFETCHW");
- } else if (UseSSE >= 1) {
- if (AllocatePrefetchInstr == 0) {
- log->print("PREFETCHNTA");
- } else if (AllocatePrefetchInstr == 1) {
- log->print("PREFETCHT0");
- } else if (AllocatePrefetchInstr == 2) {
- log->print("PREFETCHT2");
- } else if (AllocatePrefetchInstr == 3) {
- log->print("PREFETCHW");
- }
}
if (AllocatePrefetchLines > 1) {
log->print_cr(" at distance %d, %d lines of %d bytes", AllocatePrefetchDistance, AllocatePrefetchLines, AllocatePrefetchStepSize);
@@ -2174,16 +2102,15 @@ int VM_Version::avx3_threshold() {
FLAG_IS_DEFAULT(AVX3Threshold)) ? 0 : AVX3Threshold;
}
-#if defined(_LP64)
void VM_Version::clear_apx_test_state() {
clear_apx_test_state_stub();
}
-#endif
static bool _vm_version_initialized = false;
void VM_Version::initialize() {
ResourceMark rm;
+
// Making this stub must be FIRST use of assembler
stub_blob = BufferBlob::create("VM_Version stub", stub_size);
if (stub_blob == nullptr) {
@@ -2196,14 +2123,11 @@ void VM_Version::initialize() {
g.generate_get_cpu_info());
detect_virt_stub = CAST_TO_FN_PTR(detect_virt_stub_t,
g.generate_detect_virt());
-
-#if defined(_LP64)
clear_apx_test_state_stub = CAST_TO_FN_PTR(clear_apx_test_state_t,
g.clear_apx_test_state());
-#endif
get_processor_features();
- LP64_ONLY(Assembler::precompute_instructions();)
+ Assembler::precompute_instructions();
if (VM_Version::supports_hv()) { // Supports hypervisor
check_virtualizations();
@@ -2962,200 +2886,217 @@ int64_t VM_Version::maximum_qualified_cpu_frequency(void) {
return _max_qualified_cpu_frequency;
}
-uint64_t VM_Version::CpuidInfo::feature_flags() const {
- uint64_t result = 0;
+VM_Version::VM_Features VM_Version::CpuidInfo::feature_flags() const {
+ VM_Features vm_features;
if (std_cpuid1_edx.bits.cmpxchg8 != 0)
- result |= CPU_CX8;
+ vm_features.set_feature(CPU_CX8);
if (std_cpuid1_edx.bits.cmov != 0)
- result |= CPU_CMOV;
+ vm_features.set_feature(CPU_CMOV);
if (std_cpuid1_edx.bits.clflush != 0)
- result |= CPU_FLUSH;
-#ifdef _LP64
+ vm_features.set_feature(CPU_FLUSH);
// clflush should always be available on x86_64
// if not we are in real trouble because we rely on it
// to flush the code cache.
- assert ((result & CPU_FLUSH) != 0, "clflush should be available");
-#endif
+ assert (vm_features.supports_feature(CPU_FLUSH), "clflush should be available");
if (std_cpuid1_edx.bits.fxsr != 0 || (is_amd_family() &&
ext_cpuid1_edx.bits.fxsr != 0))
- result |= CPU_FXSR;
+ vm_features.set_feature(CPU_FXSR);
// HT flag is set for multi-core processors also.
if (threads_per_core() > 1)
- result |= CPU_HT;
+ vm_features.set_feature(CPU_HT);
if (std_cpuid1_edx.bits.mmx != 0 || (is_amd_family() &&
ext_cpuid1_edx.bits.mmx != 0))
- result |= CPU_MMX;
+ vm_features.set_feature(CPU_MMX);
if (std_cpuid1_edx.bits.sse != 0)
- result |= CPU_SSE;
+ vm_features.set_feature(CPU_SSE);
if (std_cpuid1_edx.bits.sse2 != 0)
- result |= CPU_SSE2;
+ vm_features.set_feature(CPU_SSE2);
if (std_cpuid1_ecx.bits.sse3 != 0)
- result |= CPU_SSE3;
+ vm_features.set_feature(CPU_SSE3);
if (std_cpuid1_ecx.bits.ssse3 != 0)
- result |= CPU_SSSE3;
+ vm_features.set_feature(CPU_SSSE3);
if (std_cpuid1_ecx.bits.sse4_1 != 0)
- result |= CPU_SSE4_1;
+ vm_features.set_feature(CPU_SSE4_1);
if (std_cpuid1_ecx.bits.sse4_2 != 0)
- result |= CPU_SSE4_2;
+ vm_features.set_feature(CPU_SSE4_2);
if (std_cpuid1_ecx.bits.popcnt != 0)
- result |= CPU_POPCNT;
+ vm_features.set_feature(CPU_POPCNT);
if (sefsl1_cpuid7_edx.bits.apx_f != 0 &&
xem_xcr0_eax.bits.apx_f != 0) {
- result |= CPU_APX_F;
+ vm_features.set_feature(CPU_APX_F);
}
if (std_cpuid1_ecx.bits.avx != 0 &&
std_cpuid1_ecx.bits.osxsave != 0 &&
xem_xcr0_eax.bits.sse != 0 &&
xem_xcr0_eax.bits.ymm != 0) {
- result |= CPU_AVX;
- result |= CPU_VZEROUPPER;
+ vm_features.set_feature(CPU_AVX);
+ vm_features.set_feature(CPU_VZEROUPPER);
if (sefsl1_cpuid7_eax.bits.sha512 != 0)
- result |= CPU_SHA512;
+ vm_features.set_feature(CPU_SHA512);
if (std_cpuid1_ecx.bits.f16c != 0)
- result |= CPU_F16C;
+ vm_features.set_feature(CPU_F16C);
if (sef_cpuid7_ebx.bits.avx2 != 0) {
- result |= CPU_AVX2;
+ vm_features.set_feature(CPU_AVX2);
if (sefsl1_cpuid7_eax.bits.avx_ifma != 0)
- result |= CPU_AVX_IFMA;
+ vm_features.set_feature(CPU_AVX_IFMA);
}
if (sef_cpuid7_ecx.bits.gfni != 0)
- result |= CPU_GFNI;
+ vm_features.set_feature(CPU_GFNI);
if (sef_cpuid7_ebx.bits.avx512f != 0 &&
xem_xcr0_eax.bits.opmask != 0 &&
xem_xcr0_eax.bits.zmm512 != 0 &&
xem_xcr0_eax.bits.zmm32 != 0) {
- result |= CPU_AVX512F;
+ vm_features.set_feature(CPU_AVX512F);
if (sef_cpuid7_ebx.bits.avx512cd != 0)
- result |= CPU_AVX512CD;
+ vm_features.set_feature(CPU_AVX512CD);
if (sef_cpuid7_ebx.bits.avx512dq != 0)
- result |= CPU_AVX512DQ;
+ vm_features.set_feature(CPU_AVX512DQ);
if (sef_cpuid7_ebx.bits.avx512ifma != 0)
- result |= CPU_AVX512_IFMA;
+ vm_features.set_feature(CPU_AVX512_IFMA);
if (sef_cpuid7_ebx.bits.avx512pf != 0)
- result |= CPU_AVX512PF;
+ vm_features.set_feature(CPU_AVX512PF);
if (sef_cpuid7_ebx.bits.avx512er != 0)
- result |= CPU_AVX512ER;
+ vm_features.set_feature(CPU_AVX512ER);
if (sef_cpuid7_ebx.bits.avx512bw != 0)
- result |= CPU_AVX512BW;
+ vm_features.set_feature(CPU_AVX512BW);
if (sef_cpuid7_ebx.bits.avx512vl != 0)
- result |= CPU_AVX512VL;
+ vm_features.set_feature(CPU_AVX512VL);
if (sef_cpuid7_ecx.bits.avx512_vpopcntdq != 0)
- result |= CPU_AVX512_VPOPCNTDQ;
+ vm_features.set_feature(CPU_AVX512_VPOPCNTDQ);
if (sef_cpuid7_ecx.bits.avx512_vpclmulqdq != 0)
- result |= CPU_AVX512_VPCLMULQDQ;
+ vm_features.set_feature(CPU_AVX512_VPCLMULQDQ);
if (sef_cpuid7_ecx.bits.vaes != 0)
- result |= CPU_AVX512_VAES;
+ vm_features.set_feature(CPU_AVX512_VAES);
if (sef_cpuid7_ecx.bits.avx512_vnni != 0)
- result |= CPU_AVX512_VNNI;
+ vm_features.set_feature(CPU_AVX512_VNNI);
if (sef_cpuid7_ecx.bits.avx512_bitalg != 0)
- result |= CPU_AVX512_BITALG;
+ vm_features.set_feature(CPU_AVX512_BITALG);
if (sef_cpuid7_ecx.bits.avx512_vbmi != 0)
- result |= CPU_AVX512_VBMI;
+ vm_features.set_feature(CPU_AVX512_VBMI);
if (sef_cpuid7_ecx.bits.avx512_vbmi2 != 0)
- result |= CPU_AVX512_VBMI2;
+ vm_features.set_feature(CPU_AVX512_VBMI2);
+ }
+ if (is_intel()) {
+ if (sefsl1_cpuid7_edx.bits.avx10 != 0 &&
+ std_cpuid24_ebx.bits.avx10_vlen_512 != 0 &&
+ std_cpuid24_ebx.bits.avx10_converged_isa_version >= 1 &&
+ xem_xcr0_eax.bits.opmask != 0 &&
+ xem_xcr0_eax.bits.zmm512 != 0 &&
+ xem_xcr0_eax.bits.zmm32 != 0) {
+ vm_features.set_feature(CPU_AVX10_1);
+ vm_features.set_feature(CPU_AVX512F);
+ vm_features.set_feature(CPU_AVX512CD);
+ vm_features.set_feature(CPU_AVX512DQ);
+ vm_features.set_feature(CPU_AVX512PF);
+ vm_features.set_feature(CPU_AVX512ER);
+ vm_features.set_feature(CPU_AVX512BW);
+ vm_features.set_feature(CPU_AVX512VL);
+ vm_features.set_feature(CPU_AVX512_VPOPCNTDQ);
+ vm_features.set_feature(CPU_AVX512_VPCLMULQDQ);
+ vm_features.set_feature(CPU_AVX512_VAES);
+ vm_features.set_feature(CPU_AVX512_VNNI);
+ vm_features.set_feature(CPU_AVX512_BITALG);
+ vm_features.set_feature(CPU_AVX512_VBMI);
+ vm_features.set_feature(CPU_AVX512_VBMI2);
+ if (std_cpuid24_ebx.bits.avx10_converged_isa_version >= 2) {
+ vm_features.set_feature(CPU_AVX10_2);
+ }
+ }
}
}
+
if (std_cpuid1_ecx.bits.hv != 0)
- result |= CPU_HV;
+ vm_features.set_feature(CPU_HV);
if (sef_cpuid7_ebx.bits.bmi1 != 0)
- result |= CPU_BMI1;
+ vm_features.set_feature(CPU_BMI1);
if (std_cpuid1_edx.bits.tsc != 0)
- result |= CPU_TSC;
+ vm_features.set_feature(CPU_TSC);
if (ext_cpuid7_edx.bits.tsc_invariance != 0)
- result |= CPU_TSCINV_BIT;
+ vm_features.set_feature(CPU_TSCINV_BIT);
if (std_cpuid1_ecx.bits.aes != 0)
- result |= CPU_AES;
+ vm_features.set_feature(CPU_AES);
+ if (ext_cpuid1_ecx.bits.lzcnt != 0)
+ vm_features.set_feature(CPU_LZCNT);
+ if (ext_cpuid1_ecx.bits.prefetchw != 0)
+ vm_features.set_feature(CPU_3DNOW_PREFETCH);
if (sef_cpuid7_ebx.bits.erms != 0)
- result |= CPU_ERMS;
+ vm_features.set_feature(CPU_ERMS);
if (sef_cpuid7_edx.bits.fast_short_rep_mov != 0)
- result |= CPU_FSRM;
+ vm_features.set_feature(CPU_FSRM);
if (std_cpuid1_ecx.bits.clmul != 0)
- result |= CPU_CLMUL;
+ vm_features.set_feature(CPU_CLMUL);
if (sef_cpuid7_ebx.bits.rtm != 0)
- result |= CPU_RTM;
+ vm_features.set_feature(CPU_RTM);
if (sef_cpuid7_ebx.bits.adx != 0)
- result |= CPU_ADX;
+ vm_features.set_feature(CPU_ADX);
if (sef_cpuid7_ebx.bits.bmi2 != 0)
- result |= CPU_BMI2;
+ vm_features.set_feature(CPU_BMI2);
if (sef_cpuid7_ebx.bits.sha != 0)
- result |= CPU_SHA;
+ vm_features.set_feature(CPU_SHA);
if (std_cpuid1_ecx.bits.fma != 0)
- result |= CPU_FMA;
+ vm_features.set_feature(CPU_FMA);
if (sef_cpuid7_ebx.bits.clflushopt != 0)
- result |= CPU_FLUSHOPT;
+ vm_features.set_feature(CPU_FLUSHOPT);
+ if (sef_cpuid7_ebx.bits.clwb != 0)
+ vm_features.set_feature(CPU_CLWB);
if (ext_cpuid1_edx.bits.rdtscp != 0)
- result |= CPU_RDTSCP;
+ vm_features.set_feature(CPU_RDTSCP);
if (sef_cpuid7_ecx.bits.rdpid != 0)
- result |= CPU_RDPID;
+ vm_features.set_feature(CPU_RDPID);
- // AMD|Hygon features.
+ // AMD|Hygon additional features.
if (is_amd_family()) {
- if ((ext_cpuid1_edx.bits.tdnow != 0) ||
- (ext_cpuid1_ecx.bits.prefetchw != 0))
- result |= CPU_3DNOW_PREFETCH;
- if (ext_cpuid1_ecx.bits.lzcnt != 0)
- result |= CPU_LZCNT;
+ // PREFETCHW was checked above, check TDNOW here.
+ if ((ext_cpuid1_edx.bits.tdnow != 0))
+ vm_features.set_feature(CPU_3DNOW_PREFETCH);
if (ext_cpuid1_ecx.bits.sse4a != 0)
- result |= CPU_SSE4A;
+ vm_features.set_feature(CPU_SSE4A);
}
- // Intel features.
+ // Intel additional features.
if (is_intel()) {
- if (ext_cpuid1_ecx.bits.lzcnt != 0) {
- result |= CPU_LZCNT;
- }
- if (ext_cpuid1_ecx.bits.prefetchw != 0) {
- result |= CPU_3DNOW_PREFETCH;
- }
- if (sef_cpuid7_ebx.bits.clwb != 0) {
- result |= CPU_CLWB;
- }
if (sef_cpuid7_edx.bits.serialize != 0)
- result |= CPU_SERIALIZE;
-
+ vm_features.set_feature(CPU_SERIALIZE);
if (_cpuid_info.sef_cpuid7_edx.bits.avx512_fp16 != 0)
- result |= CPU_AVX512_FP16;
+ vm_features.set_feature(CPU_AVX512_FP16);
}
- // ZX features.
+ // ZX additional features.
if (is_zx()) {
- if (ext_cpuid1_ecx.bits.lzcnt != 0) {
- result |= CPU_LZCNT;
- }
- if (ext_cpuid1_ecx.bits.prefetchw != 0) {
- result |= CPU_3DNOW_PREFETCH;
- }
+ // We do not know whether CLWB is actually supported on ZX parts, so we
+ // cannot trust the common CPUID bit for it.
+ assert(vm_features.supports_feature(CPU_CLWB), "Check if it is supported?");
+ vm_features.clear_feature(CPU_CLWB);
}
// Protection key features.
if (sef_cpuid7_ecx.bits.pku != 0) {
- result |= CPU_PKU;
+ vm_features.set_feature(CPU_PKU);
}
if (sef_cpuid7_ecx.bits.ospke != 0) {
- result |= CPU_OSPKE;
+ vm_features.set_feature(CPU_OSPKE);
}
// Control flow enforcement (CET) features.
if (sef_cpuid7_ecx.bits.cet_ss != 0) {
- result |= CPU_CET_SS;
+ vm_features.set_feature(CPU_CET_SS);
}
if (sef_cpuid7_edx.bits.cet_ibt != 0) {
- result |= CPU_CET_IBT;
+ vm_features.set_feature(CPU_CET_IBT);
}
// Composite features.
if (supports_tscinv_bit() &&
((is_amd_family() && !is_amd_Barcelona()) ||
is_intel_tsc_synched_at_init())) {
- result |= CPU_TSCINV;
+ vm_features.set_feature(CPU_TSCINV);
}
-
- return result;
+ return vm_features;
}
bool VM_Version::os_supports_avx_vectors() {
bool retVal = false;
- int nreg = 2 LP64_ONLY(+2);
+ int nreg = 4;
if (supports_evex()) {
// Verify that OS save/restore all bits of EVEX registers
// during signal processing.
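In the feature_flags() hunk above, an AVX10 converged-ISA version of at least 1 with the 512-bit vector length enumerated implies the whole legacy AVX-512 feature family. A self-contained sketch of that implication using plain strings instead of the Feature_Flag enum (illustrative only, not HotSpot code):

    #include <set>
    #include <string>

    // Sketch only: legacy AVX-512 feature names the patch treats as implied by
    // AVX10 version >= 1 when the 512-bit vector length is enumerated.
    std::set<std::string> implied_by_avx10(int isa_version, bool vlen_512) {
      std::set<std::string> out;
      if (isa_version >= 1 && vlen_512) {
        out = {"avx512f", "avx512cd", "avx512dq", "avx512pf", "avx512er",
               "avx512bw", "avx512vl", "avx512_vpopcntdq", "avx512_vpclmulqdq",
               "avx512_vaes", "avx512_vnni", "avx512_bitalg", "avx512_vbmi",
               "avx512_vbmi2"};
      }
      return out;
    }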
@@ -3311,11 +3252,7 @@ int VM_Version::allocate_prefetch_distance(bool use_watermark_prefetch) {
if (supports_sse4_2() && supports_ht()) { // Nehalem based cpus
return 192;
} else if (use_watermark_prefetch) { // watermark prefetching on Core
-#ifdef _LP64
return 384;
-#else
- return 320;
-#endif
}
}
if (supports_sse2()) {
@@ -3344,3 +3281,14 @@ bool VM_Version::is_intrinsic_supported(vmIntrinsicID id) {
}
return true;
}
+
+void VM_Version::insert_features_names(VM_Version::VM_Features features, char* buf, size_t buflen) {
+ for (int i = 0; i < MAX_CPU_FEATURES; i++) {
+ if (features.supports_feature((VM_Version::Feature_Flag)i)) {
+ int res = jio_snprintf(buf, buflen, ", %s", _features_names[i]);
+ assert(res > 0, "not enough temporary space allocated");
+ buf += res;
+ buflen -= res;
+ }
+ }
+}
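The new insert_features_names() helper appends one ", name" fragment per set feature into a caller-supplied buffer, advancing the write position by the number of characters written. A rough standard-library equivalent, with a stand-in name table and bitmap instead of the HotSpot ones:

    #include <cstddef>
    #include <cstdint>
    #include <cstdio>

    // Stand-in name table; the real one is VM_Version::_features_names.
    static const char* kFeatureNames[] = {"cx8", "cmov", "fxsr", "mmx"};

    // Sketch of the append loop: for every set bit write ", <name>" at the
    // current position and shrink the remaining buffer accordingly.
    void append_feature_names(uint64_t bitmap, char* buf, size_t buflen) {
      for (int i = 0; i < 4; i++) {
        if ((bitmap & (1ULL << i)) == 0) continue;
        int res = snprintf(buf, buflen, ", %s", kFeatureNames[i]);
        if (res < 0 || (size_t)res >= buflen) return;  // out of space, stop
        buf += res;
        buflen -= res;
      }
    }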
diff --git a/src/hotspot/cpu/x86/vm_version_x86.hpp b/src/hotspot/cpu/x86/vm_version_x86.hpp
index cc5c6c1c63992..a544eeb71b8cb 100644
--- a/src/hotspot/cpu/x86/vm_version_x86.hpp
+++ b/src/hotspot/cpu/x86/vm_version_x86.hpp
@@ -295,12 +295,32 @@ class VM_Version : public Abstract_VM_Version {
union SefCpuid7SubLeaf1Edx {
uint32_t value;
struct {
- uint32_t : 21,
+ uint32_t : 19,
+ avx10 : 1,
+ : 1,
apx_f : 1,
: 10;
} bits;
};
+ union StdCpuid24MainLeafEax {
+ uint32_t value;
+ struct {
+ uint32_t sub_leaves_cnt : 31;
+ } bits;
+ };
+
+ union StdCpuid24MainLeafEbx {
+ uint32_t value;
+ struct {
+ uint32_t avx10_converged_isa_version : 8,
+ : 8,
+ : 2,
+ avx10_vlen_512 : 1,
+ : 13;
+ } bits;
+ };
+
union ExtCpuid1EEbx {
uint32_t value;
struct {
@@ -342,9 +362,9 @@ class VM_Version : public Abstract_VM_Version {
/*
* Update following files when declaring new flags:
* test/lib-test/jdk/test/whitebox/CPUInfoTest.java
- * src/jdk.internal.vm.ci/share/classes/jdk.vm.ci.amd64/src/jdk/vm/ci/amd64/AMD64.java
+ * src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/amd64/AMD64.java
*/
- enum Feature_Flag : uint64_t {
+ enum Feature_Flag {
#define CPU_FEATURE_FLAGS(decl) \
decl(CX8, "cx8", 0) /* next bits are from cpuid 1 (EDX) */ \
decl(CMOV, "cmov", 1) \
@@ -420,15 +440,85 @@ class VM_Version : public Abstract_VM_Version {
decl(AVX_IFMA, "avx_ifma", 59) /* 256-bit VEX-coded variant of AVX512-IFMA*/ \
decl(APX_F, "apx_f", 60) /* Intel Advanced Performance Extensions*/ \
decl(SHA512, "sha512", 61) /* SHA512 instructions*/ \
- decl(AVX512_FP16, "avx512_fp16", 62) /* AVX512 FP16 ISA support*/
+ decl(AVX512_FP16, "avx512_fp16", 62) /* AVX512 FP16 ISA support*/ \
+ decl(AVX10_1, "avx10_1", 63) /* AVX10 512 bit vector ISA Version 1 support*/ \
+ decl(AVX10_2, "avx10_2", 64) /* AVX10 512 bit vector ISA Version 2 support*/
-#define DECLARE_CPU_FEATURE_FLAG(id, name, bit) CPU_##id = (1ULL << bit),
+#define DECLARE_CPU_FEATURE_FLAG(id, name, bit) CPU_##id = (bit),
CPU_FEATURE_FLAGS(DECLARE_CPU_FEATURE_FLAG)
#undef DECLARE_CPU_FEATURE_FLAG
+ MAX_CPU_FEATURES
};
+ class VM_Features {
+ friend class VMStructs;
+ friend class JVMCIVMStructs;
+
+ private:
+ uint64_t _features_bitmap[(MAX_CPU_FEATURES / BitsPerLong) + 1];
+
+ STATIC_ASSERT(sizeof(_features_bitmap) * BitsPerByte >= MAX_CPU_FEATURES);
+
+ // Number of 8-byte elements in _features_bitmap.
+ constexpr static int features_bitmap_element_count() {
+ return sizeof(_features_bitmap) / sizeof(uint64_t);
+ }
+
+ constexpr static int features_bitmap_element_shift_count() {
+ return LogBitsPerLong;
+ }
+
+ constexpr static uint64_t features_bitmap_element_mask() {
+ return (1ULL << features_bitmap_element_shift_count()) - 1;
+ }
+
+ static int index(Feature_Flag feature) {
+ int idx = feature >> features_bitmap_element_shift_count();
+ assert(idx < features_bitmap_element_count(), "Features array index out of bounds");
+ return idx;
+ }
+
+ static uint64_t bit_mask(Feature_Flag feature) {
+ return (1ULL << (feature & features_bitmap_element_mask()));
+ }
+
+ static int _features_bitmap_size; // for JVMCI purposes
+ public:
+ VM_Features() {
+ for (int i = 0; i < features_bitmap_element_count(); i++) {
+ _features_bitmap[i] = 0;
+ }
+ }
+
+ void set_feature(Feature_Flag feature) {
+ int idx = index(feature);
+ _features_bitmap[idx] |= bit_mask(feature);
+ }
+
+ void clear_feature(VM_Version::Feature_Flag feature) {
+ int idx = index(feature);
+ _features_bitmap[idx] &= ~bit_mask(feature);
+ }
+
+ bool supports_feature(VM_Version::Feature_Flag feature) {
+ int idx = index(feature);
+ return (_features_bitmap[idx] & bit_mask(feature)) != 0;
+ }
+ };
+
+ // CPU feature flags vector, can be affected by VM settings.
+ static VM_Features _features;
+
+ // Original CPU feature flags vector, not affected by VM settings.
+ static VM_Features _cpu_features;
+
static const char* _features_names[];
+ static void clear_cpu_features() {
+ _features = VM_Features();
+ _cpu_features = VM_Features();
+ }
+
enum Extended_Family {
// AMD
CPU_FAMILY_AMD_11H = 0x11,
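The VM_Features class added above spreads flag ordinals across an array of uint64_t words: index() selects the word (ordinal >> LogBitsPerLong) and bit_mask() the bit within it. A self-contained sketch of the same arithmetic, assuming 64-bit words as on x86_64:

    #include <cassert>
    #include <cstdint>

    // Sketch of the VM_Features word/bit arithmetic. With 65 flags
    // (CPU_CX8 == 0 .. CPU_AVX10_2 == 64) two 64-bit words are needed.
    struct FeatureBitmap {
      static const int kMaxFeatures = 65;
      uint64_t words[(kMaxFeatures / 64) + 1] = {};   // zero-initialized

      static int index(int f)         { return f >> 6; }            // word selector
      static uint64_t bit_mask(int f) { return 1ULL << (f & 63); }  // bit within word

      void set(int f)            { assert(f < kMaxFeatures); words[index(f)] |= bit_mask(f); }
      void clear(int f)          { words[index(f)] &= ~bit_mask(f); }
      bool supports(int f) const { return (words[index(f)] & bit_mask(f)) != 0; }
    };

For example, flag 64 (AVX10_2 above) lands in words[1] bit 0, while flag 63 (AVX10_1) is the top bit of words[0].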
@@ -492,6 +582,11 @@ class VM_Version : public Abstract_VM_Version {
SefCpuid7SubLeaf1Eax sefsl1_cpuid7_eax;
SefCpuid7SubLeaf1Edx sefsl1_cpuid7_edx;
+ // cpuid function 24 converged vector ISA main leaf
+ // eax = 24, ecx = 0
+ StdCpuid24MainLeafEax std_cpuid24_eax;
+ StdCpuid24MainLeafEbx std_cpuid24_ebx;
+
// cpuid function 0xB (processor topology)
// ecx = 0
uint32_t tpl_cpuidB0_eax;
@@ -565,7 +660,7 @@ class VM_Version : public Abstract_VM_Version {
// Space to save apx registers after signal handle
jlong apx_save[2]; // Save r16 and r31
- uint64_t feature_flags() const;
+ VM_Features feature_flags() const;
// Asserts
void assert_is_initialized() const {
@@ -611,6 +706,7 @@ class VM_Version : public Abstract_VM_Version {
// Offsets for cpuid asm stub
static ByteSize std_cpuid0_offset() { return byte_offset_of(CpuidInfo, std_max_function); }
static ByteSize std_cpuid1_offset() { return byte_offset_of(CpuidInfo, std_cpuid1_eax); }
+ static ByteSize std_cpuid24_offset() { return byte_offset_of(CpuidInfo, std_cpuid24_eax); }
static ByteSize dcp_cpuid4_offset() { return byte_offset_of(CpuidInfo, dcp_cpuid4_eax); }
static ByteSize sef_cpuid7_offset() { return byte_offset_of(CpuidInfo, sef_cpuid7_eax); }
static ByteSize sefsl1_cpuid7_offset() { return byte_offset_of(CpuidInfo, sefsl1_cpuid7_eax); }
@@ -642,13 +738,31 @@ class VM_Version : public Abstract_VM_Version {
static void set_cpuinfo_cont_addr_apx(address pc) { _cpuinfo_cont_addr_apx = pc; }
static address cpuinfo_cont_addr_apx() { return _cpuinfo_cont_addr_apx; }
- LP64_ONLY(static void clear_apx_test_state());
+ static void clear_apx_test_state();
- static void clean_cpuFeatures() { _features = 0; }
- static void set_avx_cpuFeatures() { _features |= (CPU_SSE | CPU_SSE2 | CPU_AVX | CPU_VZEROUPPER ); }
- static void set_evex_cpuFeatures() { _features |= (CPU_AVX512F | CPU_SSE | CPU_SSE2 | CPU_VZEROUPPER ); }
- static void set_apx_cpuFeatures() { _features |= CPU_APX_F; }
- static void set_bmi_cpuFeatures() { _features |= (CPU_BMI1 | CPU_BMI2 | CPU_LZCNT | CPU_POPCNT); }
+ static void clean_cpuFeatures() {
+ VM_Version::clear_cpu_features();
+ }
+ static void set_avx_cpuFeatures() {
+ _features.set_feature(CPU_SSE);
+ _features.set_feature(CPU_SSE2);
+ _features.set_feature(CPU_AVX);
+ _features.set_feature(CPU_VZEROUPPER);
+ }
+ static void set_evex_cpuFeatures() {
+ _features.set_feature(CPU_AVX10_1);
+ _features.set_feature(CPU_AVX512F);
+ _features.set_feature(CPU_SSE);
+ _features.set_feature(CPU_SSE2);
+ _features.set_feature(CPU_VZEROUPPER);
+ }
+ static void set_apx_cpuFeatures() { _features.set_feature(CPU_APX_F); }
+ static void set_bmi_cpuFeatures() {
+ _features.set_feature(CPU_BMI1);
+ _features.set_feature(CPU_BMI2);
+ _features.set_feature(CPU_LZCNT);
+ _features.set_feature(CPU_POPCNT);
+ }
// Initialization
static void initialize();
@@ -703,40 +817,39 @@ class VM_Version : public Abstract_VM_Version {
//
// Feature identification which can be affected by VM settings
//
- static bool supports_cpuid() { return _features != 0; }
- static bool supports_cmov() { return (_features & CPU_CMOV) != 0; }
- static bool supports_fxsr() { return (_features & CPU_FXSR) != 0; }
- static bool supports_ht() { return (_features & CPU_HT) != 0; }
- static bool supports_mmx() { return (_features & CPU_MMX) != 0; }
- static bool supports_sse() { return (_features & CPU_SSE) != 0; }
- static bool supports_sse2() { return (_features & CPU_SSE2) != 0; }
- static bool supports_sse3() { return (_features & CPU_SSE3) != 0; }
- static bool supports_ssse3() { return (_features & CPU_SSSE3)!= 0; }
- static bool supports_sse4_1() { return (_features & CPU_SSE4_1) != 0; }
- static bool supports_sse4_2() { return (_features & CPU_SSE4_2) != 0; }
- static bool supports_popcnt() { return (_features & CPU_POPCNT) != 0; }
- static bool supports_avx() { return (_features & CPU_AVX) != 0; }
- static bool supports_avx2() { return (_features & CPU_AVX2) != 0; }
- static bool supports_tsc() { return (_features & CPU_TSC) != 0; }
- static bool supports_rdtscp() { return (_features & CPU_RDTSCP) != 0; }
- static bool supports_rdpid() { return (_features & CPU_RDPID) != 0; }
- static bool supports_aes() { return (_features & CPU_AES) != 0; }
- static bool supports_erms() { return (_features & CPU_ERMS) != 0; }
- static bool supports_fsrm() { return (_features & CPU_FSRM) != 0; }
- static bool supports_clmul() { return (_features & CPU_CLMUL) != 0; }
- static bool supports_rtm() { return (_features & CPU_RTM) != 0; }
- static bool supports_bmi1() { return (_features & CPU_BMI1) != 0; }
- static bool supports_bmi2() { return (_features & CPU_BMI2) != 0; }
- static bool supports_adx() { return (_features & CPU_ADX) != 0; }
- static bool supports_evex() { return (_features & CPU_AVX512F) != 0; }
- static bool supports_avx512dq() { return (_features & CPU_AVX512DQ) != 0; }
- static bool supports_avx512ifma() { return (_features & CPU_AVX512_IFMA) != 0; }
- static bool supports_avxifma() { return (_features & CPU_AVX_IFMA) != 0; }
- static bool supports_avx512pf() { return (_features & CPU_AVX512PF) != 0; }
- static bool supports_avx512er() { return (_features & CPU_AVX512ER) != 0; }
- static bool supports_avx512cd() { return (_features & CPU_AVX512CD) != 0; }
- static bool supports_avx512bw() { return (_features & CPU_AVX512BW) != 0; }
- static bool supports_avx512vl() { return (_features & CPU_AVX512VL) != 0; }
+ static bool supports_cmov() { return _features.supports_feature(CPU_CMOV); }
+ static bool supports_fxsr() { return _features.supports_feature(CPU_FXSR); }
+ static bool supports_ht() { return _features.supports_feature(CPU_HT); }
+ static bool supports_mmx() { return _features.supports_feature(CPU_MMX); }
+ static bool supports_sse() { return _features.supports_feature(CPU_SSE); }
+ static bool supports_sse2() { return _features.supports_feature(CPU_SSE2); }
+ static bool supports_sse3() { return _features.supports_feature(CPU_SSE3); }
+ static bool supports_ssse3() { return _features.supports_feature(CPU_SSSE3); }
+ static bool supports_sse4_1() { return _features.supports_feature(CPU_SSE4_1); }
+ static bool supports_sse4_2() { return _features.supports_feature(CPU_SSE4_2); }
+ static bool supports_popcnt() { return _features.supports_feature(CPU_POPCNT); }
+ static bool supports_avx() { return _features.supports_feature(CPU_AVX); }
+ static bool supports_avx2() { return _features.supports_feature(CPU_AVX2); }
+ static bool supports_tsc() { return _features.supports_feature(CPU_TSC); }
+ static bool supports_rdtscp() { return _features.supports_feature(CPU_RDTSCP); }
+ static bool supports_rdpid() { return _features.supports_feature(CPU_RDPID); }
+ static bool supports_aes() { return _features.supports_feature(CPU_AES); }
+ static bool supports_erms() { return _features.supports_feature(CPU_ERMS); }
+ static bool supports_fsrm() { return _features.supports_feature(CPU_FSRM); }
+ static bool supports_clmul() { return _features.supports_feature(CPU_CLMUL); }
+ static bool supports_rtm() { return _features.supports_feature(CPU_RTM); }
+ static bool supports_bmi1() { return _features.supports_feature(CPU_BMI1); }
+ static bool supports_bmi2() { return _features.supports_feature(CPU_BMI2); }
+ static bool supports_adx() { return _features.supports_feature(CPU_ADX); }
+ static bool supports_evex() { return _features.supports_feature(CPU_AVX512F); }
+ static bool supports_avx512dq() { return _features.supports_feature(CPU_AVX512DQ); }
+ static bool supports_avx512ifma() { return _features.supports_feature(CPU_AVX512_IFMA); }
+ static bool supports_avxifma() { return _features.supports_feature(CPU_AVX_IFMA); }
+ static bool supports_avx512pf() { return _features.supports_feature(CPU_AVX512PF); }
+ static bool supports_avx512er() { return _features.supports_feature(CPU_AVX512ER); }
+ static bool supports_avx512cd() { return _features.supports_feature(CPU_AVX512CD); }
+ static bool supports_avx512bw() { return _features.supports_feature(CPU_AVX512BW); }
+ static bool supports_avx512vl() { return _features.supports_feature(CPU_AVX512VL); }
static bool supports_avx512vlbw() { return (supports_evex() && supports_avx512bw() && supports_avx512vl()); }
static bool supports_avx512bwdq() { return (supports_evex() && supports_avx512bw() && supports_avx512dq()); }
static bool supports_avx512vldq() { return (supports_evex() && supports_avx512dq() && supports_avx512vl()); }
@@ -745,33 +858,39 @@ class VM_Version : public Abstract_VM_Version {
static bool supports_avx512novl() { return (supports_evex() && !supports_avx512vl()); }
static bool supports_avx512nobw() { return (supports_evex() && !supports_avx512bw()); }
static bool supports_avx256only() { return (supports_avx2() && !supports_evex()); }
- static bool supports_apx_f() { return (_features & CPU_APX_F) != 0; }
+ static bool supports_apx_f() { return _features.supports_feature(CPU_APX_F); }
static bool supports_avxonly() { return ((supports_avx2() || supports_avx()) && !supports_evex()); }
- static bool supports_sha() { return (_features & CPU_SHA) != 0; }
- static bool supports_fma() { return (_features & CPU_FMA) != 0 && supports_avx(); }
- static bool supports_vzeroupper() { return (_features & CPU_VZEROUPPER) != 0; }
- static bool supports_avx512_vpopcntdq() { return (_features & CPU_AVX512_VPOPCNTDQ) != 0; }
- static bool supports_avx512_vpclmulqdq() { return (_features & CPU_AVX512_VPCLMULQDQ) != 0; }
- static bool supports_avx512_vaes() { return (_features & CPU_AVX512_VAES) != 0; }
- static bool supports_gfni() { return (_features & CPU_GFNI) != 0; }
- static bool supports_avx512_vnni() { return (_features & CPU_AVX512_VNNI) != 0; }
- static bool supports_avx512_bitalg() { return (_features & CPU_AVX512_BITALG) != 0; }
- static bool supports_avx512_vbmi() { return (_features & CPU_AVX512_VBMI) != 0; }
- static bool supports_avx512_vbmi2() { return (_features & CPU_AVX512_VBMI2) != 0; }
- static bool supports_avx512_fp16() { return (_features & CPU_AVX512_FP16) != 0; }
- static bool supports_hv() { return (_features & CPU_HV) != 0; }
- static bool supports_serialize() { return (_features & CPU_SERIALIZE) != 0; }
- static bool supports_f16c() { return (_features & CPU_F16C) != 0; }
- static bool supports_pku() { return (_features & CPU_PKU) != 0; }
- static bool supports_ospke() { return (_features & CPU_OSPKE) != 0; }
- static bool supports_cet_ss() { return (_features & CPU_CET_SS) != 0; }
- static bool supports_cet_ibt() { return (_features & CPU_CET_IBT) != 0; }
- static bool supports_sha512() { return (_features & CPU_SHA512) != 0; }
+ static bool supports_sha() { return _features.supports_feature(CPU_SHA); }
+ static bool supports_fma() { return _features.supports_feature(CPU_FMA) && supports_avx(); }
+ static bool supports_vzeroupper() { return _features.supports_feature(CPU_VZEROUPPER); }
+ static bool supports_avx512_vpopcntdq() { return _features.supports_feature(CPU_AVX512_VPOPCNTDQ); }
+ static bool supports_avx512_vpclmulqdq() { return _features.supports_feature(CPU_AVX512_VPCLMULQDQ); }
+ static bool supports_avx512_vaes() { return _features.supports_feature(CPU_AVX512_VAES); }
+ static bool supports_gfni() { return _features.supports_feature(CPU_GFNI); }
+ static bool supports_avx512_vnni() { return _features.supports_feature(CPU_AVX512_VNNI); }
+ static bool supports_avx512_bitalg() { return _features.supports_feature(CPU_AVX512_BITALG); }
+ static bool supports_avx512_vbmi() { return _features.supports_feature(CPU_AVX512_VBMI); }
+ static bool supports_avx512_vbmi2() { return _features.supports_feature(CPU_AVX512_VBMI2); }
+ static bool supports_avx512_fp16() { return _features.supports_feature(CPU_AVX512_FP16); }
+ static bool supports_hv() { return _features.supports_feature(CPU_HV); }
+ static bool supports_serialize() { return _features.supports_feature(CPU_SERIALIZE); }
+ static bool supports_f16c() { return _features.supports_feature(CPU_F16C); }
+ static bool supports_pku() { return _features.supports_feature(CPU_PKU); }
+ static bool supports_ospke() { return _features.supports_feature(CPU_OSPKE); }
+ static bool supports_cet_ss() { return _features.supports_feature(CPU_CET_SS); }
+ static bool supports_cet_ibt() { return _features.supports_feature(CPU_CET_IBT); }
+ static bool supports_sha512() { return _features.supports_feature(CPU_SHA512); }
+
+ // Intel® AVX10 introduces a versioned approach to feature enumeration that is monotonically increasing,
+ // inclusive, and supports all vector lengths. The feature set supported by an AVX10 vector ISA version is
+ // also supported by all later versions.
+ static bool supports_avx10_1() { return _features.supports_feature(CPU_AVX10_1);}
+ static bool supports_avx10_2() { return _features.supports_feature(CPU_AVX10_2);}
//
// Feature identification not affected by VM flags
//
- static bool cpu_supports_evex() { return (_cpu_features & CPU_AVX512F) != 0; }
+ static bool cpu_supports_evex() { return _cpu_features.supports_feature(CPU_AVX512F); }
static bool supports_avx512_simd_sort() {
if (supports_avx512dq()) {
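Because the AVX10 enumeration described above is monotonic and inclusive, an AVX10.2 part also satisfies every AVX10.1 check, so version tests should be "at least version N" rather than an exact match. A minimal sketch with a hypothetical helper, not HotSpot code:

    #include <utility>

    // Sketch only: derive the two version flags from the converged ISA version;
    // version >= 2 implies version >= 1, mirroring supports_avx10_1/2 above.
    std::pair<bool, bool> avx10_flags(int converged_isa_version) {
      bool has_v1 = converged_isa_version >= 1;
      bool has_v2 = converged_isa_version >= 2;
      return {has_v1, has_v2};
    }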
@@ -802,6 +921,8 @@ class VM_Version : public Abstract_VM_Version {
static bool is_intel_tsc_synched_at_init();
+ static void insert_features_names(VM_Version::VM_Features features, char* buf, size_t buflen);
+
// This checks if the JVM is potentially affected by an erratum on Intel CPUs (SKX102)
// that causes unpredictable behaviour when jcc crosses 64 byte boundaries. Its microcode
// mitigation causes regressions when jumps or fused conditional branches cross or end at
@@ -809,19 +930,19 @@ class VM_Version : public Abstract_VM_Version {
static bool has_intel_jcc_erratum() { return _has_intel_jcc_erratum; }
// AMD features
- static bool supports_3dnow_prefetch() { return (_features & CPU_3DNOW_PREFETCH) != 0; }
- static bool supports_lzcnt() { return (_features & CPU_LZCNT) != 0; }
- static bool supports_sse4a() { return (_features & CPU_SSE4A) != 0; }
+ static bool supports_3dnow_prefetch() { return _features.supports_feature(CPU_3DNOW_PREFETCH); }
+ static bool supports_lzcnt() { return _features.supports_feature(CPU_LZCNT); }
+ static bool supports_sse4a() { return _features.supports_feature(CPU_SSE4A); }
static bool is_amd_Barcelona() { return is_amd() &&
extended_cpu_family() == CPU_FAMILY_AMD_11H; }
// Intel and AMD newer cores support fast timestamps well
static bool supports_tscinv_bit() {
- return (_features & CPU_TSCINV_BIT) != 0;
+ return _features.supports_feature(CPU_TSCINV_BIT);
}
static bool supports_tscinv() {
- return (_features & CPU_TSCINV) != 0;
+ return _features.supports_feature(CPU_TSCINV);
}
// Intel Core and newer cpus have fast IDIV instruction (excluding Atom).
@@ -839,12 +960,12 @@ class VM_Version : public Abstract_VM_Version {
// x86_64 supports fast class initialization checks
static bool supports_fast_class_init_checks() {
- return LP64_ONLY(true) NOT_LP64(false); // not implemented on x86_32
+ return true;
}
// x86_64 supports secondary supers table
constexpr static bool supports_secondary_supers_table() {
- return LP64_ONLY(true) NOT_LP64(false); // not implemented on x86_32
+ return true;
}
constexpr static bool supports_stack_watermark_barrier() {
@@ -879,15 +1000,11 @@ class VM_Version : public Abstract_VM_Version {
// synchronize with other memory ops. so, it needs preceding
// and trailing StoreStore fences.
-#ifdef _LP64
static bool supports_clflush(); // Can't inline due to header file conflict
-#else
- static bool supports_clflush() { return ((_features & CPU_FLUSH) != 0); }
-#endif // _LP64
// Note: CPU_FLUSHOPT and CPU_CLWB bits should always be zero for 32-bit
- static bool supports_clflushopt() { return ((_features & CPU_FLUSHOPT) != 0); }
- static bool supports_clwb() { return ((_features & CPU_CLWB) != 0); }
+ static bool supports_clflushopt() { return (_features.supports_feature(CPU_FLUSHOPT)); }
+ static bool supports_clwb() { return (_features.supports_feature(CPU_CLWB)); }
// Old CPUs perform lea on AGU which causes additional latency transferring the
// value from/to ALU for other operations
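The supports_clflushopt()/supports_clwb() queries above feed the cache-line writeback path (Op_CacheWB). A hedged sketch of a typical selection order, preferring the instruction that writes the line back without needlessly invalidating it (hypothetical helper; the actual stub generation lives elsewhere):

    // Sketch only: pick a cache-line writeback instruction, preferring CLWB,
    // then CLFLUSHOPT, and falling back to CLFLUSH, which is always present
    // on x86_64.
    enum class CacheWb { kClwb, kClflushopt, kClflush };

    CacheWb pick_cache_wb(bool has_clwb, bool has_clflushopt) {
      if (has_clwb)       return CacheWb::kClwb;
      if (has_clflushopt) return CacheWb::kClflushopt;
      return CacheWb::kClflush;
    }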
diff --git a/src/hotspot/cpu/x86/x86.ad b/src/hotspot/cpu/x86/x86.ad
index afa1a92287d99..a281331cb2986 100644
--- a/src/hotspot/cpu/x86/x86.ad
+++ b/src/hotspot/cpu/x86/x86.ad
@@ -210,8 +210,6 @@ reg_def XMM7n( SOC, SOC, Op_RegF, 7, xmm7->as_VMReg()->next(13));
reg_def XMM7o( SOC, SOC, Op_RegF, 7, xmm7->as_VMReg()->next(14));
reg_def XMM7p( SOC, SOC, Op_RegF, 7, xmm7->as_VMReg()->next(15));
-#ifdef _LP64
-
reg_def XMM8 ( SOC, SOC, Op_RegF, 8, xmm8->as_VMReg());
reg_def XMM8b( SOC, SOC, Op_RegF, 8, xmm8->as_VMReg()->next(1));
reg_def XMM8c( SOC, SOC, Op_RegF, 8, xmm8->as_VMReg()->next(2));
@@ -620,13 +618,7 @@ reg_def XMM31n( SOC, SOC, Op_RegF, 31, xmm31->as_VMReg()->next(13));
reg_def XMM31o( SOC, SOC, Op_RegF, 31, xmm31->as_VMReg()->next(14));
reg_def XMM31p( SOC, SOC, Op_RegF, 31, xmm31->as_VMReg()->next(15));
-#endif // _LP64
-
-#ifdef _LP64
reg_def RFLAGS(SOC, SOC, 0, 16, VMRegImpl::Bad());
-#else
-reg_def RFLAGS(SOC, SOC, 0, 8, VMRegImpl::Bad());
-#endif // _LP64
// AVX3 Mask Registers.
reg_def K1 (SOC, SOC, Op_RegI, 1, k1->as_VMReg());
@@ -658,17 +650,16 @@ alloc_class chunk1(XMM0, XMM0b, XMM0c, XMM0d, XMM0e, XMM0f, XMM0g, XMM0h,
XMM4, XMM4b, XMM4c, XMM4d, XMM4e, XMM4f, XMM4g, XMM4h, XMM4i, XMM4j, XMM4k, XMM4l, XMM4m, XMM4n, XMM4o, XMM4p,
XMM5, XMM5b, XMM5c, XMM5d, XMM5e, XMM5f, XMM5g, XMM5h, XMM5i, XMM5j, XMM5k, XMM5l, XMM5m, XMM5n, XMM5o, XMM5p,
XMM6, XMM6b, XMM6c, XMM6d, XMM6e, XMM6f, XMM6g, XMM6h, XMM6i, XMM6j, XMM6k, XMM6l, XMM6m, XMM6n, XMM6o, XMM6p,
- XMM7, XMM7b, XMM7c, XMM7d, XMM7e, XMM7f, XMM7g, XMM7h, XMM7i, XMM7j, XMM7k, XMM7l, XMM7m, XMM7n, XMM7o, XMM7p
-#ifdef _LP64
- ,XMM8, XMM8b, XMM8c, XMM8d, XMM8e, XMM8f, XMM8g, XMM8h, XMM8i, XMM8j, XMM8k, XMM8l, XMM8m, XMM8n, XMM8o, XMM8p,
+ XMM7, XMM7b, XMM7c, XMM7d, XMM7e, XMM7f, XMM7g, XMM7h, XMM7i, XMM7j, XMM7k, XMM7l, XMM7m, XMM7n, XMM7o, XMM7p,
+ XMM8, XMM8b, XMM8c, XMM8d, XMM8e, XMM8f, XMM8g, XMM8h, XMM8i, XMM8j, XMM8k, XMM8l, XMM8m, XMM8n, XMM8o, XMM8p,
XMM9, XMM9b, XMM9c, XMM9d, XMM9e, XMM9f, XMM9g, XMM9h, XMM9i, XMM9j, XMM9k, XMM9l, XMM9m, XMM9n, XMM9o, XMM9p,
XMM10, XMM10b, XMM10c, XMM10d, XMM10e, XMM10f, XMM10g, XMM10h, XMM10i, XMM10j, XMM10k, XMM10l, XMM10m, XMM10n, XMM10o, XMM10p,
XMM11, XMM11b, XMM11c, XMM11d, XMM11e, XMM11f, XMM11g, XMM11h, XMM11i, XMM11j, XMM11k, XMM11l, XMM11m, XMM11n, XMM11o, XMM11p,
XMM12, XMM12b, XMM12c, XMM12d, XMM12e, XMM12f, XMM12g, XMM12h, XMM12i, XMM12j, XMM12k, XMM12l, XMM12m, XMM12n, XMM12o, XMM12p,
XMM13, XMM13b, XMM13c, XMM13d, XMM13e, XMM13f, XMM13g, XMM13h, XMM13i, XMM13j, XMM13k, XMM13l, XMM13m, XMM13n, XMM13o, XMM13p,
XMM14, XMM14b, XMM14c, XMM14d, XMM14e, XMM14f, XMM14g, XMM14h, XMM14i, XMM14j, XMM14k, XMM14l, XMM14m, XMM14n, XMM14o, XMM14p,
- XMM15, XMM15b, XMM15c, XMM15d, XMM15e, XMM15f, XMM15g, XMM15h, XMM15i, XMM15j, XMM15k, XMM15l, XMM15m, XMM15n, XMM15o, XMM15p
- ,XMM16, XMM16b, XMM16c, XMM16d, XMM16e, XMM16f, XMM16g, XMM16h, XMM16i, XMM16j, XMM16k, XMM16l, XMM16m, XMM16n, XMM16o, XMM16p,
+ XMM15, XMM15b, XMM15c, XMM15d, XMM15e, XMM15f, XMM15g, XMM15h, XMM15i, XMM15j, XMM15k, XMM15l, XMM15m, XMM15n, XMM15o, XMM15p,
+ XMM16, XMM16b, XMM16c, XMM16d, XMM16e, XMM16f, XMM16g, XMM16h, XMM16i, XMM16j, XMM16k, XMM16l, XMM16m, XMM16n, XMM16o, XMM16p,
XMM17, XMM17b, XMM17c, XMM17d, XMM17e, XMM17f, XMM17g, XMM17h, XMM17i, XMM17j, XMM17k, XMM17l, XMM17m, XMM17n, XMM17o, XMM17p,
XMM18, XMM18b, XMM18c, XMM18d, XMM18e, XMM18f, XMM18g, XMM18h, XMM18i, XMM18j, XMM18k, XMM18l, XMM18m, XMM18n, XMM18o, XMM18p,
XMM19, XMM19b, XMM19c, XMM19d, XMM19e, XMM19f, XMM19g, XMM19h, XMM19i, XMM19j, XMM19k, XMM19l, XMM19m, XMM19n, XMM19o, XMM19p,
@@ -683,9 +674,7 @@ alloc_class chunk1(XMM0, XMM0b, XMM0c, XMM0d, XMM0e, XMM0f, XMM0g, XMM0h,
XMM28, XMM28b, XMM28c, XMM28d, XMM28e, XMM28f, XMM28g, XMM28h, XMM28i, XMM28j, XMM28k, XMM28l, XMM28m, XMM28n, XMM28o, XMM28p,
XMM29, XMM29b, XMM29c, XMM29d, XMM29e, XMM29f, XMM29g, XMM29h, XMM29i, XMM29j, XMM29k, XMM29l, XMM29m, XMM29n, XMM29o, XMM29p,
XMM30, XMM30b, XMM30c, XMM30d, XMM30e, XMM30f, XMM30g, XMM30h, XMM30i, XMM30j, XMM30k, XMM30l, XMM30m, XMM30n, XMM30o, XMM30p,
- XMM31, XMM31b, XMM31c, XMM31d, XMM31e, XMM31f, XMM31g, XMM31h, XMM31i, XMM31j, XMM31k, XMM31l, XMM31m, XMM31n, XMM31o, XMM31p
-#endif
- );
+ XMM31, XMM31b, XMM31c, XMM31d, XMM31e, XMM31f, XMM31g, XMM31h, XMM31i, XMM31j, XMM31k, XMM31l, XMM31m, XMM31n, XMM31o, XMM31p);
alloc_class chunk2(K7, K7_H,
K6, K6_H,
@@ -726,18 +715,15 @@ reg_class float_reg_legacy(XMM0,
XMM4,
XMM5,
XMM6,
- XMM7
-#ifdef _LP64
- ,XMM8,
+ XMM7,
+ XMM8,
XMM9,
XMM10,
XMM11,
XMM12,
XMM13,
XMM14,
- XMM15
-#endif
- );
+ XMM15);
// Class for evex float registers
reg_class float_reg_evex(XMM0,
@@ -747,9 +733,8 @@ reg_class float_reg_evex(XMM0,
XMM4,
XMM5,
XMM6,
- XMM7
-#ifdef _LP64
- ,XMM8,
+ XMM7,
+ XMM8,
XMM9,
XMM10,
XMM11,
@@ -772,9 +757,7 @@ reg_class float_reg_evex(XMM0,
XMM28,
XMM29,
XMM30,
- XMM31
-#endif
- );
+ XMM31);
reg_class_dynamic float_reg(float_reg_evex, float_reg_legacy, %{ VM_Version::supports_evex() %} );
reg_class_dynamic float_reg_vl(float_reg_evex, float_reg_legacy, %{ VM_Version::supports_evex() && VM_Version::supports_avx512vl() %} );
@@ -787,18 +770,15 @@ reg_class double_reg_legacy(XMM0, XMM0b,
XMM4, XMM4b,
XMM5, XMM5b,
XMM6, XMM6b,
- XMM7, XMM7b
-#ifdef _LP64
- ,XMM8, XMM8b,
+ XMM7, XMM7b,
+ XMM8, XMM8b,
XMM9, XMM9b,
XMM10, XMM10b,
XMM11, XMM11b,
XMM12, XMM12b,
XMM13, XMM13b,
XMM14, XMM14b,
- XMM15, XMM15b
-#endif
- );
+ XMM15, XMM15b);
// Class for evex double registers
reg_class double_reg_evex(XMM0, XMM0b,
@@ -808,9 +788,8 @@ reg_class double_reg_evex(XMM0, XMM0b,
XMM4, XMM4b,
XMM5, XMM5b,
XMM6, XMM6b,
- XMM7, XMM7b
-#ifdef _LP64
- ,XMM8, XMM8b,
+ XMM7, XMM7b,
+ XMM8, XMM8b,
XMM9, XMM9b,
XMM10, XMM10b,
XMM11, XMM11b,
@@ -833,9 +812,7 @@ reg_class double_reg_evex(XMM0, XMM0b,
XMM28, XMM28b,
XMM29, XMM29b,
XMM30, XMM30b,
- XMM31, XMM31b
-#endif
- );
+ XMM31, XMM31b);
reg_class_dynamic double_reg(double_reg_evex, double_reg_legacy, %{ VM_Version::supports_evex() %} );
reg_class_dynamic double_reg_vl(double_reg_evex, double_reg_legacy, %{ VM_Version::supports_evex() && VM_Version::supports_avx512vl() %} );
@@ -848,18 +825,15 @@ reg_class vectors_reg_legacy(XMM0,
XMM4,
XMM5,
XMM6,
- XMM7
-#ifdef _LP64
- ,XMM8,
+ XMM7,
+ XMM8,
XMM9,
XMM10,
XMM11,
XMM12,
XMM13,
XMM14,
- XMM15
-#endif
- );
+ XMM15);
// Class for evex 32bit vector registers
reg_class vectors_reg_evex(XMM0,
@@ -869,9 +843,8 @@ reg_class vectors_reg_evex(XMM0,
XMM4,
XMM5,
XMM6,
- XMM7
-#ifdef _LP64
- ,XMM8,
+ XMM7,
+ XMM8,
XMM9,
XMM10,
XMM11,
@@ -894,9 +867,7 @@ reg_class vectors_reg_evex(XMM0,
XMM28,
XMM29,
XMM30,
- XMM31
-#endif
- );
+ XMM31);
reg_class_dynamic vectors_reg(vectors_reg_evex, vectors_reg_legacy, %{ VM_Version::supports_evex() %} );
reg_class_dynamic vectors_reg_vlbwdq(vectors_reg_evex, vectors_reg_legacy, %{ VM_Version::supports_avx512vlbwdq() %} );
@@ -909,18 +880,15 @@ reg_class vectord_reg_legacy(XMM0, XMM0b,
XMM4, XMM4b,
XMM5, XMM5b,
XMM6, XMM6b,
- XMM7, XMM7b
-#ifdef _LP64
- ,XMM8, XMM8b,
+ XMM7, XMM7b,
+ XMM8, XMM8b,
XMM9, XMM9b,
XMM10, XMM10b,
XMM11, XMM11b,
XMM12, XMM12b,
XMM13, XMM13b,
XMM14, XMM14b,
- XMM15, XMM15b
-#endif
- );
+ XMM15, XMM15b);
// Class for all 64bit vector registers
reg_class vectord_reg_evex(XMM0, XMM0b,
@@ -930,9 +898,8 @@ reg_class vectord_reg_evex(XMM0, XMM0b,
XMM4, XMM4b,
XMM5, XMM5b,
XMM6, XMM6b,
- XMM7, XMM7b
-#ifdef _LP64
- ,XMM8, XMM8b,
+ XMM7, XMM7b,
+ XMM8, XMM8b,
XMM9, XMM9b,
XMM10, XMM10b,
XMM11, XMM11b,
@@ -955,9 +922,7 @@ reg_class vectord_reg_evex(XMM0, XMM0b,
XMM28, XMM28b,
XMM29, XMM29b,
XMM30, XMM30b,
- XMM31, XMM31b
-#endif
- );
+ XMM31, XMM31b);
reg_class_dynamic vectord_reg(vectord_reg_evex, vectord_reg_legacy, %{ VM_Version::supports_evex() %} );
reg_class_dynamic vectord_reg_vlbwdq(vectord_reg_evex, vectord_reg_legacy, %{ VM_Version::supports_avx512vlbwdq() %} );
@@ -970,18 +935,15 @@ reg_class vectorx_reg_legacy(XMM0, XMM0b, XMM0c, XMM0d,
XMM4, XMM4b, XMM4c, XMM4d,
XMM5, XMM5b, XMM5c, XMM5d,
XMM6, XMM6b, XMM6c, XMM6d,
- XMM7, XMM7b, XMM7c, XMM7d
-#ifdef _LP64
- ,XMM8, XMM8b, XMM8c, XMM8d,
+ XMM7, XMM7b, XMM7c, XMM7d,
+ XMM8, XMM8b, XMM8c, XMM8d,
XMM9, XMM9b, XMM9c, XMM9d,
XMM10, XMM10b, XMM10c, XMM10d,
XMM11, XMM11b, XMM11c, XMM11d,
XMM12, XMM12b, XMM12c, XMM12d,
XMM13, XMM13b, XMM13c, XMM13d,
XMM14, XMM14b, XMM14c, XMM14d,
- XMM15, XMM15b, XMM15c, XMM15d
-#endif
- );
+ XMM15, XMM15b, XMM15c, XMM15d);
// Class for all 128bit vector registers
reg_class vectorx_reg_evex(XMM0, XMM0b, XMM0c, XMM0d,
@@ -991,9 +953,8 @@ reg_class vectorx_reg_evex(XMM0, XMM0b, XMM0c, XMM0d,
XMM4, XMM4b, XMM4c, XMM4d,
XMM5, XMM5b, XMM5c, XMM5d,
XMM6, XMM6b, XMM6c, XMM6d,
- XMM7, XMM7b, XMM7c, XMM7d
-#ifdef _LP64
- ,XMM8, XMM8b, XMM8c, XMM8d,
+ XMM7, XMM7b, XMM7c, XMM7d,
+ XMM8, XMM8b, XMM8c, XMM8d,
XMM9, XMM9b, XMM9c, XMM9d,
XMM10, XMM10b, XMM10c, XMM10d,
XMM11, XMM11b, XMM11c, XMM11d,
@@ -1016,9 +977,7 @@ reg_class vectorx_reg_evex(XMM0, XMM0b, XMM0c, XMM0d,
XMM28, XMM28b, XMM28c, XMM28d,
XMM29, XMM29b, XMM29c, XMM29d,
XMM30, XMM30b, XMM30c, XMM30d,
- XMM31, XMM31b, XMM31c, XMM31d
-#endif
- );
+ XMM31, XMM31b, XMM31c, XMM31d);
reg_class_dynamic vectorx_reg(vectorx_reg_evex, vectorx_reg_legacy, %{ VM_Version::supports_evex() %} );
reg_class_dynamic vectorx_reg_vlbwdq(vectorx_reg_evex, vectorx_reg_legacy, %{ VM_Version::supports_avx512vlbwdq() %} );
@@ -1031,18 +990,15 @@ reg_class vectory_reg_legacy(XMM0, XMM0b, XMM0c, XMM0d, XMM0e, XMM0f, XMM0
XMM4, XMM4b, XMM4c, XMM4d, XMM4e, XMM4f, XMM4g, XMM4h,
XMM5, XMM5b, XMM5c, XMM5d, XMM5e, XMM5f, XMM5g, XMM5h,
XMM6, XMM6b, XMM6c, XMM6d, XMM6e, XMM6f, XMM6g, XMM6h,
- XMM7, XMM7b, XMM7c, XMM7d, XMM7e, XMM7f, XMM7g, XMM7h
-#ifdef _LP64
- ,XMM8, XMM8b, XMM8c, XMM8d, XMM8e, XMM8f, XMM8g, XMM8h,
+ XMM7, XMM7b, XMM7c, XMM7d, XMM7e, XMM7f, XMM7g, XMM7h,
+ XMM8, XMM8b, XMM8c, XMM8d, XMM8e, XMM8f, XMM8g, XMM8h,
XMM9, XMM9b, XMM9c, XMM9d, XMM9e, XMM9f, XMM9g, XMM9h,
XMM10, XMM10b, XMM10c, XMM10d, XMM10e, XMM10f, XMM10g, XMM10h,
XMM11, XMM11b, XMM11c, XMM11d, XMM11e, XMM11f, XMM11g, XMM11h,
XMM12, XMM12b, XMM12c, XMM12d, XMM12e, XMM12f, XMM12g, XMM12h,
XMM13, XMM13b, XMM13c, XMM13d, XMM13e, XMM13f, XMM13g, XMM13h,
XMM14, XMM14b, XMM14c, XMM14d, XMM14e, XMM14f, XMM14g, XMM14h,
- XMM15, XMM15b, XMM15c, XMM15d, XMM15e, XMM15f, XMM15g, XMM15h
-#endif
- );
+ XMM15, XMM15b, XMM15c, XMM15d, XMM15e, XMM15f, XMM15g, XMM15h);
// Class for all 256bit vector registers
reg_class vectory_reg_evex(XMM0, XMM0b, XMM0c, XMM0d, XMM0e, XMM0f, XMM0g, XMM0h,
@@ -1052,9 +1008,8 @@ reg_class vectory_reg_evex(XMM0, XMM0b, XMM0c, XMM0d, XMM0e, XMM0f, XMM0g,
XMM4, XMM4b, XMM4c, XMM4d, XMM4e, XMM4f, XMM4g, XMM4h,
XMM5, XMM5b, XMM5c, XMM5d, XMM5e, XMM5f, XMM5g, XMM5h,
XMM6, XMM6b, XMM6c, XMM6d, XMM6e, XMM6f, XMM6g, XMM6h,
- XMM7, XMM7b, XMM7c, XMM7d, XMM7e, XMM7f, XMM7g, XMM7h
-#ifdef _LP64
- ,XMM8, XMM8b, XMM8c, XMM8d, XMM8e, XMM8f, XMM8g, XMM8h,
+ XMM7, XMM7b, XMM7c, XMM7d, XMM7e, XMM7f, XMM7g, XMM7h,
+ XMM8, XMM8b, XMM8c, XMM8d, XMM8e, XMM8f, XMM8g, XMM8h,
XMM9, XMM9b, XMM9c, XMM9d, XMM9e, XMM9f, XMM9g, XMM9h,
XMM10, XMM10b, XMM10c, XMM10d, XMM10e, XMM10f, XMM10g, XMM10h,
XMM11, XMM11b, XMM11c, XMM11d, XMM11e, XMM11f, XMM11g, XMM11h,
@@ -1077,9 +1032,7 @@ reg_class vectory_reg_evex(XMM0, XMM0b, XMM0c, XMM0d, XMM0e, XMM0f, XMM0g,
XMM28, XMM28b, XMM28c, XMM28d, XMM28e, XMM28f, XMM28g, XMM28h,
XMM29, XMM29b, XMM29c, XMM29d, XMM29e, XMM29f, XMM29g, XMM29h,
XMM30, XMM30b, XMM30c, XMM30d, XMM30e, XMM30f, XMM30g, XMM30h,
- XMM31, XMM31b, XMM31c, XMM31d, XMM31e, XMM31f, XMM31g, XMM31h
-#endif
- );
+ XMM31, XMM31b, XMM31c, XMM31d, XMM31e, XMM31f, XMM31g, XMM31h);
reg_class_dynamic vectory_reg(vectory_reg_evex, vectory_reg_legacy, %{ VM_Version::supports_evex() %} );
reg_class_dynamic vectory_reg_vlbwdq(vectory_reg_evex, vectory_reg_legacy, %{ VM_Version::supports_avx512vlbwdq() %} );
@@ -1092,17 +1045,16 @@ reg_class vectorz_reg_evex(XMM0, XMM0b, XMM0c, XMM0d, XMM0e, XMM0f, XMM0g,
XMM4, XMM4b, XMM4c, XMM4d, XMM4e, XMM4f, XMM4g, XMM4h, XMM4i, XMM4j, XMM4k, XMM4l, XMM4m, XMM4n, XMM4o, XMM4p,
XMM5, XMM5b, XMM5c, XMM5d, XMM5e, XMM5f, XMM5g, XMM5h, XMM5i, XMM5j, XMM5k, XMM5l, XMM5m, XMM5n, XMM5o, XMM5p,
XMM6, XMM6b, XMM6c, XMM6d, XMM6e, XMM6f, XMM6g, XMM6h, XMM6i, XMM6j, XMM6k, XMM6l, XMM6m, XMM6n, XMM6o, XMM6p,
- XMM7, XMM7b, XMM7c, XMM7d, XMM7e, XMM7f, XMM7g, XMM7h, XMM7i, XMM7j, XMM7k, XMM7l, XMM7m, XMM7n, XMM7o, XMM7p
-#ifdef _LP64
- ,XMM8, XMM8b, XMM8c, XMM8d, XMM8e, XMM8f, XMM8g, XMM8h, XMM8i, XMM8j, XMM8k, XMM8l, XMM8m, XMM8n, XMM8o, XMM8p,
+ XMM7, XMM7b, XMM7c, XMM7d, XMM7e, XMM7f, XMM7g, XMM7h, XMM7i, XMM7j, XMM7k, XMM7l, XMM7m, XMM7n, XMM7o, XMM7p,
+ XMM8, XMM8b, XMM8c, XMM8d, XMM8e, XMM8f, XMM8g, XMM8h, XMM8i, XMM8j, XMM8k, XMM8l, XMM8m, XMM8n, XMM8o, XMM8p,
XMM9, XMM9b, XMM9c, XMM9d, XMM9e, XMM9f, XMM9g, XMM9h, XMM9i, XMM9j, XMM9k, XMM9l, XMM9m, XMM9n, XMM9o, XMM9p,
XMM10, XMM10b, XMM10c, XMM10d, XMM10e, XMM10f, XMM10g, XMM10h, XMM10i, XMM10j, XMM10k, XMM10l, XMM10m, XMM10n, XMM10o, XMM10p,
XMM11, XMM11b, XMM11c, XMM11d, XMM11e, XMM11f, XMM11g, XMM11h, XMM11i, XMM11j, XMM11k, XMM11l, XMM11m, XMM11n, XMM11o, XMM11p,
XMM12, XMM12b, XMM12c, XMM12d, XMM12e, XMM12f, XMM12g, XMM12h, XMM12i, XMM12j, XMM12k, XMM12l, XMM12m, XMM12n, XMM12o, XMM12p,
XMM13, XMM13b, XMM13c, XMM13d, XMM13e, XMM13f, XMM13g, XMM13h, XMM13i, XMM13j, XMM13k, XMM13l, XMM13m, XMM13n, XMM13o, XMM13p,
XMM14, XMM14b, XMM14c, XMM14d, XMM14e, XMM14f, XMM14g, XMM14h, XMM14i, XMM14j, XMM14k, XMM14l, XMM14m, XMM14n, XMM14o, XMM14p,
- XMM15, XMM15b, XMM15c, XMM15d, XMM15e, XMM15f, XMM15g, XMM15h, XMM15i, XMM15j, XMM15k, XMM15l, XMM15m, XMM15n, XMM15o, XMM15p
- ,XMM16, XMM16b, XMM16c, XMM16d, XMM16e, XMM16f, XMM16g, XMM16h, XMM16i, XMM16j, XMM16k, XMM16l, XMM16m, XMM16n, XMM16o, XMM16p,
+ XMM15, XMM15b, XMM15c, XMM15d, XMM15e, XMM15f, XMM15g, XMM15h, XMM15i, XMM15j, XMM15k, XMM15l, XMM15m, XMM15n, XMM15o, XMM15p,
+ XMM16, XMM16b, XMM16c, XMM16d, XMM16e, XMM16f, XMM16g, XMM16h, XMM16i, XMM16j, XMM16k, XMM16l, XMM16m, XMM16n, XMM16o, XMM16p,
XMM17, XMM17b, XMM17c, XMM17d, XMM17e, XMM17f, XMM17g, XMM17h, XMM17i, XMM17j, XMM17k, XMM17l, XMM17m, XMM17n, XMM17o, XMM17p,
XMM18, XMM18b, XMM18c, XMM18d, XMM18e, XMM18f, XMM18g, XMM18h, XMM18i, XMM18j, XMM18k, XMM18l, XMM18m, XMM18n, XMM18o, XMM18p,
XMM19, XMM19b, XMM19c, XMM19d, XMM19e, XMM19f, XMM19g, XMM19h, XMM19i, XMM19j, XMM19k, XMM19l, XMM19m, XMM19n, XMM19o, XMM19p,
@@ -1117,9 +1069,7 @@ reg_class vectorz_reg_evex(XMM0, XMM0b, XMM0c, XMM0d, XMM0e, XMM0f, XMM0g,
XMM28, XMM28b, XMM28c, XMM28d, XMM28e, XMM28f, XMM28g, XMM28h, XMM28i, XMM28j, XMM28k, XMM28l, XMM28m, XMM28n, XMM28o, XMM28p,
XMM29, XMM29b, XMM29c, XMM29d, XMM29e, XMM29f, XMM29g, XMM29h, XMM29i, XMM29j, XMM29k, XMM29l, XMM29m, XMM29n, XMM29o, XMM29p,
XMM30, XMM30b, XMM30c, XMM30d, XMM30e, XMM30f, XMM30g, XMM30h, XMM30i, XMM30j, XMM30k, XMM30l, XMM30m, XMM30n, XMM30o, XMM30p,
- XMM31, XMM31b, XMM31c, XMM31d, XMM31e, XMM31f, XMM31g, XMM31h, XMM31i, XMM31j, XMM31k, XMM31l, XMM31m, XMM31n, XMM31o, XMM31p
-#endif
- );
+ XMM31, XMM31b, XMM31c, XMM31d, XMM31e, XMM31f, XMM31g, XMM31h, XMM31i, XMM31j, XMM31k, XMM31l, XMM31m, XMM31n, XMM31o, XMM31p);
// Class for restricted 512bit vector registers
reg_class vectorz_reg_legacy(XMM0, XMM0b, XMM0c, XMM0d, XMM0e, XMM0f, XMM0g, XMM0h, XMM0i, XMM0j, XMM0k, XMM0l, XMM0m, XMM0n, XMM0o, XMM0p,
@@ -1129,18 +1079,15 @@ reg_class vectorz_reg_legacy(XMM0, XMM0b, XMM0c, XMM0d, XMM0e, XMM0f, XMM0
XMM4, XMM4b, XMM4c, XMM4d, XMM4e, XMM4f, XMM4g, XMM4h, XMM4i, XMM4j, XMM4k, XMM4l, XMM4m, XMM4n, XMM4o, XMM4p,
XMM5, XMM5b, XMM5c, XMM5d, XMM5e, XMM5f, XMM5g, XMM5h, XMM5i, XMM5j, XMM5k, XMM5l, XMM5m, XMM5n, XMM5o, XMM5p,
XMM6, XMM6b, XMM6c, XMM6d, XMM6e, XMM6f, XMM6g, XMM6h, XMM6i, XMM6j, XMM6k, XMM6l, XMM6m, XMM6n, XMM6o, XMM6p,
- XMM7, XMM7b, XMM7c, XMM7d, XMM7e, XMM7f, XMM7g, XMM7h, XMM7i, XMM7j, XMM7k, XMM7l, XMM7m, XMM7n, XMM7o, XMM7p
-#ifdef _LP64
- ,XMM8, XMM8b, XMM8c, XMM8d, XMM8e, XMM8f, XMM8g, XMM8h, XMM8i, XMM8j, XMM8k, XMM8l, XMM8m, XMM8n, XMM8o, XMM8p,
+ XMM7, XMM7b, XMM7c, XMM7d, XMM7e, XMM7f, XMM7g, XMM7h, XMM7i, XMM7j, XMM7k, XMM7l, XMM7m, XMM7n, XMM7o, XMM7p,
+ XMM8, XMM8b, XMM8c, XMM8d, XMM8e, XMM8f, XMM8g, XMM8h, XMM8i, XMM8j, XMM8k, XMM8l, XMM8m, XMM8n, XMM8o, XMM8p,
XMM9, XMM9b, XMM9c, XMM9d, XMM9e, XMM9f, XMM9g, XMM9h, XMM9i, XMM9j, XMM9k, XMM9l, XMM9m, XMM9n, XMM9o, XMM9p,
XMM10, XMM10b, XMM10c, XMM10d, XMM10e, XMM10f, XMM10g, XMM10h, XMM10i, XMM10j, XMM10k, XMM10l, XMM10m, XMM10n, XMM10o, XMM10p,
XMM11, XMM11b, XMM11c, XMM11d, XMM11e, XMM11f, XMM11g, XMM11h, XMM11i, XMM11j, XMM11k, XMM11l, XMM11m, XMM11n, XMM11o, XMM11p,
XMM12, XMM12b, XMM12c, XMM12d, XMM12e, XMM12f, XMM12g, XMM12h, XMM12i, XMM12j, XMM12k, XMM12l, XMM12m, XMM12n, XMM12o, XMM12p,
XMM13, XMM13b, XMM13c, XMM13d, XMM13e, XMM13f, XMM13g, XMM13h, XMM13i, XMM13j, XMM13k, XMM13l, XMM13m, XMM13n, XMM13o, XMM13p,
XMM14, XMM14b, XMM14c, XMM14d, XMM14e, XMM14f, XMM14g, XMM14h, XMM14i, XMM14j, XMM14k, XMM14l, XMM14m, XMM14n, XMM14o, XMM14p,
- XMM15, XMM15b, XMM15c, XMM15d, XMM15e, XMM15f, XMM15g, XMM15h, XMM15i, XMM15j, XMM15k, XMM15l, XMM15m, XMM15n, XMM15o, XMM15p
-#endif
- );
+ XMM15, XMM15b, XMM15c, XMM15d, XMM15e, XMM15f, XMM15g, XMM15h, XMM15i, XMM15j, XMM15k, XMM15l, XMM15m, XMM15n, XMM15o, XMM15p);
reg_class_dynamic vectorz_reg (vectorz_reg_evex, vectorz_reg_legacy, %{ VM_Version::supports_evex() %} );
reg_class_dynamic vectorz_reg_vl(vectorz_reg_evex, vectorz_reg_legacy, %{ VM_Version::supports_evex() && VM_Version::supports_avx512vl() %} );
@@ -1199,21 +1146,10 @@ class HandlerImpl {
return NativeJump::instruction_size;
}
-#ifdef _LP64
static uint size_deopt_handler() {
// three 5 byte instructions plus one move for unreachable address.
return 15+3;
}
-#else
- static uint size_deopt_handler() {
- // NativeCall instruction size is the same as NativeJump.
- // exception handler starts out as jump and can be patched to
- // a call be deoptimization. (4932387)
- // Note that this value is also credited (in output.cpp) to
- // the size of the code section.
- return 5 + NativeJump::instruction_size; // pushl(); jmp;
- }
-#endif
};
inline Assembler::AvxVectorLen vector_length_encoding(int bytes) {
@@ -1334,7 +1270,6 @@ int HandlerImpl::emit_deopt_handler(C2_MacroAssembler* masm) {
}
int offset = __ offset();
-#ifdef _LP64
address the_pc = (address) __ pc();
Label next;
// push a "the_pc" on the stack without destroying any registers
@@ -1345,10 +1280,6 @@ int HandlerImpl::emit_deopt_handler(C2_MacroAssembler* masm) {
__ bind(next);
// adjust it so it matches "the_pc"
__ subptr(Address(rsp, 0), __ offset() - offset);
-#else
- InternalAddress here(__ pc());
- __ pushptr(here.addr(), noreg);
-#endif
__ jump(RuntimeAddress(SharedRuntime::deopt_blob()->unpack()));
assert(__ offset() - offset <= (int) size_deopt_handler(), "overflow %d", (__ offset() - offset));
@@ -1372,17 +1303,10 @@ static Assembler::Width widthForType(BasicType bt) {
//=============================================================================
// Float masks come from different places depending on platform.
-#ifdef _LP64
static address float_signmask() { return StubRoutines::x86::float_sign_mask(); }
static address float_signflip() { return StubRoutines::x86::float_sign_flip(); }
static address double_signmask() { return StubRoutines::x86::double_sign_mask(); }
static address double_signflip() { return StubRoutines::x86::double_sign_flip(); }
-#else
- static address float_signmask() { return (address)float_signmask_pool; }
- static address float_signflip() { return (address)float_signflip_pool; }
- static address double_signmask() { return (address)double_signmask_pool; }
- static address double_signflip() { return (address)double_signflip_pool; }
-#endif
static address vector_short_to_byte_mask() { return StubRoutines::x86::vector_short_to_byte_mask(); }
static address vector_int_to_byte_mask() { return StubRoutines::x86::vector_int_to_byte_mask(); }
static address vector_byte_perm_mask() { return StubRoutines::x86::vector_byte_perm_mask(); }
@@ -1404,7 +1328,6 @@ bool Matcher::match_rule_supported(int opcode) {
if (!has_match_rule(opcode)) {
return false; // no match rule present
}
- const bool is_LP64 = LP64_ONLY(true) NOT_LP64(false);
switch (opcode) {
case Op_AbsVL:
case Op_StoreVectorScatter:
@@ -1445,11 +1368,6 @@ bool Matcher::match_rule_supported(int opcode) {
return false;
}
break;
- case Op_AddReductionVL:
- if (UseSSE < 2) { // requires at least SSE2
- return false;
- }
- break;
case Op_AbsVB:
case Op_AbsVS:
case Op_AbsVI:
@@ -1509,7 +1427,7 @@ bool Matcher::match_rule_supported(int opcode) {
}
break;
case Op_PopulateIndex:
- if (!is_LP64 || (UseAVX < 2)) {
+ if (UseAVX < 2) {
return false;
}
break;
@@ -1524,9 +1442,7 @@ bool Matcher::match_rule_supported(int opcode) {
}
break;
case Op_CompareAndSwapL:
-#ifdef _LP64
case Op_CompareAndSwapP:
-#endif
break;
case Op_StrIndexOf:
if (!UseSSE42Intrinsics) {
@@ -1555,7 +1471,6 @@ bool Matcher::match_rule_supported(int opcode) {
return false;
}
break;
-#ifdef _LP64
case Op_MaxD:
case Op_MaxF:
case Op_MinD:
@@ -1564,7 +1479,6 @@ bool Matcher::match_rule_supported(int opcode) {
return false;
}
break;
-#endif
case Op_CacheWB:
case Op_CacheWBPreSync:
case Op_CacheWBPostSync:
@@ -1607,7 +1521,7 @@ bool Matcher::match_rule_supported(int opcode) {
case Op_VectorCmpMasked:
case Op_VectorMaskGen:
- if (!is_LP64 || UseAVX < 3 || !VM_Version::supports_bmi2()) {
+ if (UseAVX < 3 || !VM_Version::supports_bmi2()) {
return false;
}
break;
@@ -1615,60 +1529,25 @@ bool Matcher::match_rule_supported(int opcode) {
case Op_VectorMaskLastTrue:
case Op_VectorMaskTrueCount:
case Op_VectorMaskToLong:
- if (!is_LP64 || UseAVX < 1) {
+ if (UseAVX < 1) {
return false;
}
break;
case Op_RoundF:
case Op_RoundD:
- if (!is_LP64) {
- return false;
- }
break;
case Op_CopySignD:
case Op_CopySignF:
- if (UseAVX < 3 || !is_LP64) {
+ if (UseAVX < 3) {
return false;
}
if (!VM_Version::supports_avx512vl()) {
return false;
}
break;
-#ifndef _LP64
- case Op_AddReductionVF:
- case Op_AddReductionVD:
- case Op_MulReductionVF:
- case Op_MulReductionVD:
- if (UseSSE < 1) { // requires at least SSE
- return false;
- }
- break;
- case Op_MulAddVS2VI:
- case Op_RShiftVL:
- case Op_AbsVD:
- case Op_NegVD:
- if (UseSSE < 2) {
- return false;
- }
- break;
-#endif // !LP64
case Op_CompressBits:
- if (!VM_Version::supports_bmi2() || (!is_LP64 && UseSSE < 2)) {
- return false;
- }
- break;
case Op_ExpandBits:
- if (!VM_Version::supports_bmi2() || (!is_LP64 && (UseSSE < 2 || !VM_Version::supports_bmi1()))) {
- return false;
- }
- break;
- case Op_SignumF:
- if (UseSSE < 1) {
- return false;
- }
- break;
- case Op_SignumD:
- if (UseSSE < 2) {
+ if (!VM_Version::supports_bmi2()) {
return false;
}
break;
@@ -1677,21 +1556,6 @@ bool Matcher::match_rule_supported(int opcode) {
return false;
}
break;
- case Op_SqrtF:
- if (UseSSE < 1) {
- return false;
- }
- break;
- case Op_SqrtD:
-#ifdef _LP64
- if (UseSSE < 2) {
- return false;
- }
-#else
- // x86_32.ad has a special match rule for SqrtD.
- // Together with common x86 rules, this handles all UseSSE cases.
-#endif
- break;
case Op_ConvF2HF:
case Op_ConvHF2F:
if (!VM_Version::supports_float16()) {
@@ -1722,7 +1586,6 @@ bool Matcher::match_rule_supported_auto_vectorization(int opcode, int vlen, Basi
// Identify extra cases that we might want to provide match rules for vector nodes and
// other intrinsics guarded with vector length (vlen) and element type (bt).
bool Matcher::match_rule_supported_vector(int opcode, int vlen, BasicType bt) {
- const bool is_LP64 = LP64_ONLY(true) NOT_LP64(false);
if (!match_rule_supported(opcode)) {
return false;
}
@@ -1743,6 +1606,24 @@ bool Matcher::match_rule_supported_vector(int opcode, int vlen, BasicType bt) {
// * 128bit vroundpd instruction is present only in AVX1
int size_in_bits = vlen * type2aelembytes(bt) * BitsPerByte;
switch (opcode) {
+ case Op_MaxVHF:
+ case Op_MinVHF:
+ if (!VM_Version::supports_avx512bw()) {
+ return false;
+ }
+ case Op_AddVHF:
+ case Op_DivVHF:
+ case Op_FmaVHF:
+ case Op_MulVHF:
+ case Op_SubVHF:
+ case Op_SqrtVHF:
+ if (size_in_bits < 512 && !VM_Version::supports_avx512vl()) {
+ return false;
+ }
+ if (!VM_Version::supports_avx512_fp16()) {
+ return false;
+ }
+ break;
case Op_AbsVF:
case Op_NegVF:
if ((vlen == 16) && (VM_Version::supports_avx512dq() == false)) {
@@ -1769,7 +1650,7 @@ bool Matcher::match_rule_supported_vector(int opcode, int vlen, BasicType bt) {
case Op_ClearArray:
case Op_VectorMaskGen:
case Op_VectorCmpMasked:
- if (!is_LP64 || !VM_Version::supports_avx512bw()) {
+ if (!VM_Version::supports_avx512bw()) {
return false;
}
if ((size_in_bits != 512) && !VM_Version::supports_avx512vl()) {
@@ -1819,19 +1700,7 @@ bool Matcher::match_rule_supported_vector(int opcode, int vlen, BasicType bt) {
if (is_subword_type(bt) && (UseSSE < 4)) {
return false;
}
-#ifndef _LP64
- if (bt == T_BYTE || bt == T_LONG) {
- return false;
- }
-#endif
break;
-#ifndef _LP64
- case Op_VectorInsert:
- if (bt == T_LONG || bt == T_DOUBLE) {
- return false;
- }
- break;
-#endif
case Op_MinReductionV:
case Op_MaxReductionV:
if ((bt == T_INT || is_subword_type(bt)) && UseSSE < 4) {
@@ -1846,11 +1715,6 @@ bool Matcher::match_rule_supported_vector(int opcode, int vlen, BasicType bt) {
if (UseAVX > 2 && (!VM_Version::supports_avx512dq() && size_in_bits == 512)) {
return false;
}
-#ifndef _LP64
- if (bt == T_BYTE || bt == T_LONG) {
- return false;
- }
-#endif
break;
case Op_VectorTest:
if (UseSSE < 4) {
@@ -1935,10 +1799,9 @@ bool Matcher::match_rule_supported_vector(int opcode, int vlen, BasicType bt) {
return false;
}
if (is_subword_type(bt) &&
- (!is_LP64 ||
- (size_in_bits > 256 && !VM_Version::supports_avx512bw()) ||
- (size_in_bits < 64) ||
- (bt == T_SHORT && !VM_Version::supports_bmi2()))) {
+ ((size_in_bits > 256 && !VM_Version::supports_avx512bw()) ||
+ (size_in_bits < 64) ||
+ (bt == T_SHORT && !VM_Version::supports_bmi2()))) {
return false;
}
break;
@@ -2007,14 +1870,11 @@ bool Matcher::match_rule_supported_vector(int opcode, int vlen, BasicType bt) {
if (is_subword_type(bt) && !VM_Version::supports_avx512_vbmi2()) {
return false;
}
- if (!is_LP64 && !VM_Version::supports_avx512vl() && size_in_bits < 512) {
- return false;
- }
if (size_in_bits < 128 ) {
return false;
}
case Op_VectorLongToMask:
- if (UseAVX < 1 || !is_LP64) {
+ if (UseAVX < 1) {
return false;
}
if (UseAVX < 3 && !VM_Version::supports_bmi2()) {
@@ -2062,7 +1922,6 @@ bool Matcher::match_rule_supported_vector_masked(int opcode, int vlen, BasicType
return false;
}
- const bool is_LP64 = LP64_ONLY(true) NOT_LP64(false);
int size_in_bits = vlen * type2aelembytes(bt) * BitsPerByte;
if (size_in_bits != 512 && !VM_Version::supports_avx512vl()) {
return false;
@@ -2307,7 +2166,6 @@ const RegMask* Matcher::predicate_reg_mask(void) {
// Max vector size in bytes. 0 if not supported.
int Matcher::vector_width_in_bytes(BasicType bt) {
assert(is_java_primitive(bt), "only primitive type vectors");
- if (UseSSE < 2) return 0;
// SSE2 supports 128bit vectors for all types.
// AVX2 supports 256bit vectors for all types.
// AVX2/EVEX supports 512bit vectors for all types.
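With the 32-bit and pre-SSE2 paths removed, the maximum vector width follows the comment above: 16 bytes as the SSE2 baseline, 32 with AVX2, 64 with EVEX. A hedged sketch of that baseline mapping (hypothetical helper; the real vector_width_in_bytes also caps the result by element type and several flags):

    // Illustrative only: baseline max vector width in bytes by ISA level,
    // ignoring the per-type and flag-based caps the real matcher applies.
    int baseline_vector_width_in_bytes(int use_avx, bool supports_evex) {
      if (use_avx > 2 && supports_evex) return 64;  // 512-bit EVEX
      if (use_avx >= 2)                 return 32;  // 256-bit AVX2
      return 16;                                    // 128-bit SSE2 baseline
    }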
@@ -2398,7 +2256,6 @@ static bool clone_shift(Node* shift, Matcher* matcher, Matcher::MStack& mstack,
address_visited.set(shift->_idx); // Flag as address_visited
mstack.push(shift->in(2), Matcher::Visit);
Node *conv = shift->in(1);
-#ifdef _LP64
// Allow Matcher to match the rule which bypass
// ConvI2L operation for an array index on LP64
// if the index value is positive.
@@ -2408,9 +2265,9 @@ static bool clone_shift(Node* shift, Matcher* matcher, Matcher::MStack& mstack,
!matcher->is_visited(conv)) {
address_visited.set(conv->_idx); // Flag as address_visited
mstack.push(conv->in(1), Matcher::Pre_Visit);
- } else
-#endif
+ } else {
mstack.push(conv, Matcher::Pre_Visit);
+ }
return true;
}
return false;
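// [Editor's illustration, not part of this patch] The shortcut above relies on the
// fact that for a non-negative int index, sign- and zero-extension agree, so the
// scaled address can be formed from the 32-bit index register without materializing
// the ConvI2L node. A self-contained sketch of that equivalence:
#include <cassert>
#include <cstdint>
static int64_t scaled_index_sketch(int32_t index, int shift) {
  assert(index >= 0);  // the matcher only applies the bypass when the index type range is non-negative
  int64_t sign_extended = static_cast<int64_t>(index) << shift;
  int64_t zero_extended = static_cast<int64_t>(static_cast<uint32_t>(index)) << shift;
  assert(sign_extended == zero_extended);  // holds exactly because index >= 0
  return sign_extended;
}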
@@ -2548,7 +2405,7 @@ bool Matcher::pd_clone_address_expressions(AddPNode* m, Matcher::MStack& mstack,
if (adr->is_AddP() &&
!adr->in(AddPNode::Base)->is_top() &&
!adr->in(AddPNode::Offset)->is_Con() &&
- LP64_ONLY( off->get_long() == (int) (off->get_long()) && ) // immL32
+ off->get_long() == (int) (off->get_long()) && // immL32
// Are there other uses besides address expressions?
!is_visited(adr)) {
address_visited.set(adr->_idx); // Flag as address_visited
@@ -2622,26 +2479,18 @@ static void vec_mov_helper(C2_MacroAssembler *masm, int src_lo, int dst_lo,
case Op_VecS: // copy whole register
case Op_VecD:
case Op_VecX:
-#ifndef _LP64
- __ movdqu(as_XMMRegister(Matcher::_regEncode[dst_lo]), as_XMMRegister(Matcher::_regEncode[src_lo]));
-#else
if ((UseAVX < 3) || VM_Version::supports_avx512vl()) {
__ movdqu(as_XMMRegister(Matcher::_regEncode[dst_lo]), as_XMMRegister(Matcher::_regEncode[src_lo]));
} else {
__ vextractf32x4(as_XMMRegister(Matcher::_regEncode[dst_lo]), as_XMMRegister(Matcher::_regEncode[src_lo]), 0x0);
}
-#endif
break;
case Op_VecY:
-#ifndef _LP64
- __ vmovdqu(as_XMMRegister(Matcher::_regEncode[dst_lo]), as_XMMRegister(Matcher::_regEncode[src_lo]));
-#else
if ((UseAVX < 3) || VM_Version::supports_avx512vl()) {
__ vmovdqu(as_XMMRegister(Matcher::_regEncode[dst_lo]), as_XMMRegister(Matcher::_regEncode[src_lo]));
} else {
__ vextractf64x4(as_XMMRegister(Matcher::_regEncode[dst_lo]), as_XMMRegister(Matcher::_regEncode[src_lo]), 0x0);
}
-#endif
break;
case Op_VecZ:
__ evmovdquq(as_XMMRegister(Matcher::_regEncode[dst_lo]), as_XMMRegister(Matcher::_regEncode[src_lo]), 2);
@@ -2680,28 +2529,20 @@ void vec_spill_helper(C2_MacroAssembler *masm, bool is_load,
__ movq(as_XMMRegister(Matcher::_regEncode[reg]), Address(rsp, stack_offset));
break;
case Op_VecX:
-#ifndef _LP64
- __ movdqu(as_XMMRegister(Matcher::_regEncode[reg]), Address(rsp, stack_offset));
-#else
if ((UseAVX < 3) || VM_Version::supports_avx512vl()) {
__ movdqu(as_XMMRegister(Matcher::_regEncode[reg]), Address(rsp, stack_offset));
} else {
__ vpxor(as_XMMRegister(Matcher::_regEncode[reg]), as_XMMRegister(Matcher::_regEncode[reg]), as_XMMRegister(Matcher::_regEncode[reg]), 2);
__ vinsertf32x4(as_XMMRegister(Matcher::_regEncode[reg]), as_XMMRegister(Matcher::_regEncode[reg]), Address(rsp, stack_offset),0x0);
}
-#endif
break;
case Op_VecY:
-#ifndef _LP64
- __ vmovdqu(as_XMMRegister(Matcher::_regEncode[reg]), Address(rsp, stack_offset));
-#else
if ((UseAVX < 3) || VM_Version::supports_avx512vl()) {
__ vmovdqu(as_XMMRegister(Matcher::_regEncode[reg]), Address(rsp, stack_offset));
} else {
__ vpxor(as_XMMRegister(Matcher::_regEncode[reg]), as_XMMRegister(Matcher::_regEncode[reg]), as_XMMRegister(Matcher::_regEncode[reg]), 2);
__ vinsertf64x4(as_XMMRegister(Matcher::_regEncode[reg]), as_XMMRegister(Matcher::_regEncode[reg]), Address(rsp, stack_offset),0x0);
}
-#endif
break;
case Op_VecZ:
__ evmovdquq(as_XMMRegister(Matcher::_regEncode[reg]), Address(rsp, stack_offset), 2);
@@ -2718,28 +2559,20 @@ void vec_spill_helper(C2_MacroAssembler *masm, bool is_load,
__ movq(Address(rsp, stack_offset), as_XMMRegister(Matcher::_regEncode[reg]));
break;
case Op_VecX:
-#ifndef _LP64
- __ movdqu(Address(rsp, stack_offset), as_XMMRegister(Matcher::_regEncode[reg]));
-#else
if ((UseAVX < 3) || VM_Version::supports_avx512vl()) {
__ movdqu(Address(rsp, stack_offset), as_XMMRegister(Matcher::_regEncode[reg]));
}
else {
__ vextractf32x4(Address(rsp, stack_offset), as_XMMRegister(Matcher::_regEncode[reg]), 0x0);
}
-#endif
break;
case Op_VecY:
-#ifndef _LP64
- __ vmovdqu(Address(rsp, stack_offset), as_XMMRegister(Matcher::_regEncode[reg]));
-#else
if ((UseAVX < 3) || VM_Version::supports_avx512vl()) {
__ vmovdqu(Address(rsp, stack_offset), as_XMMRegister(Matcher::_regEncode[reg]));
}
else {
__ vextractf64x4(Address(rsp, stack_offset), as_XMMRegister(Matcher::_regEncode[reg]), 0x0);
}
-#endif
break;
case Op_VecZ:
__ evmovdquq(Address(rsp, stack_offset), as_XMMRegister(Matcher::_regEncode[reg]), 2);
@@ -3033,7 +2866,8 @@ instruct ShouldNotReachHere() %{
format %{ "stop\t# ShouldNotReachHere" %}
ins_encode %{
if (is_reachable()) {
- __ stop(_halt_reason);
+ const char* str = __ code_string(_halt_reason);
+ __ stop(str);
}
%}
ins_pipe(pipe_slow);
@@ -3042,7 +2876,7 @@ instruct ShouldNotReachHere() %{
// ============================================================================
instruct addF_reg(regF dst, regF src) %{
- predicate((UseSSE>=1) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (AddF dst src));
format %{ "addss $dst, $src" %}
@@ -3054,7 +2888,7 @@ instruct addF_reg(regF dst, regF src) %{
%}
instruct addF_mem(regF dst, memory src) %{
- predicate((UseSSE>=1) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (AddF dst (LoadF src)));
format %{ "addss $dst, $src" %}
@@ -3066,7 +2900,7 @@ instruct addF_mem(regF dst, memory src) %{
%}
instruct addF_imm(regF dst, immF con) %{
- predicate((UseSSE>=1) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (AddF dst con));
format %{ "addss $dst, [$constantaddress]\t# load from constant table: float=$con" %}
ins_cost(150);
@@ -3113,7 +2947,7 @@ instruct addF_reg_imm(regF dst, regF src, immF con) %{
%}
instruct addD_reg(regD dst, regD src) %{
- predicate((UseSSE>=2) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (AddD dst src));
format %{ "addsd $dst, $src" %}
@@ -3125,7 +2959,7 @@ instruct addD_reg(regD dst, regD src) %{
%}
instruct addD_mem(regD dst, memory src) %{
- predicate((UseSSE>=2) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (AddD dst (LoadD src)));
format %{ "addsd $dst, $src" %}
@@ -3137,7 +2971,7 @@ instruct addD_mem(regD dst, memory src) %{
%}
instruct addD_imm(regD dst, immD con) %{
- predicate((UseSSE>=2) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (AddD dst con));
format %{ "addsd $dst, [$constantaddress]\t# load from constant table: double=$con" %}
ins_cost(150);
@@ -3184,7 +3018,7 @@ instruct addD_reg_imm(regD dst, regD src, immD con) %{
%}
instruct subF_reg(regF dst, regF src) %{
- predicate((UseSSE>=1) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (SubF dst src));
format %{ "subss $dst, $src" %}
@@ -3196,7 +3030,7 @@ instruct subF_reg(regF dst, regF src) %{
%}
instruct subF_mem(regF dst, memory src) %{
- predicate((UseSSE>=1) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (SubF dst (LoadF src)));
format %{ "subss $dst, $src" %}
@@ -3208,7 +3042,7 @@ instruct subF_mem(regF dst, memory src) %{
%}
instruct subF_imm(regF dst, immF con) %{
- predicate((UseSSE>=1) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (SubF dst con));
format %{ "subss $dst, [$constantaddress]\t# load from constant table: float=$con" %}
ins_cost(150);
@@ -3255,7 +3089,7 @@ instruct subF_reg_imm(regF dst, regF src, immF con) %{
%}
instruct subD_reg(regD dst, regD src) %{
- predicate((UseSSE>=2) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (SubD dst src));
format %{ "subsd $dst, $src" %}
@@ -3267,7 +3101,7 @@ instruct subD_reg(regD dst, regD src) %{
%}
instruct subD_mem(regD dst, memory src) %{
- predicate((UseSSE>=2) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (SubD dst (LoadD src)));
format %{ "subsd $dst, $src" %}
@@ -3279,7 +3113,7 @@ instruct subD_mem(regD dst, memory src) %{
%}
instruct subD_imm(regD dst, immD con) %{
- predicate((UseSSE>=2) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (SubD dst con));
format %{ "subsd $dst, [$constantaddress]\t# load from constant table: double=$con" %}
ins_cost(150);
@@ -3326,7 +3160,7 @@ instruct subD_reg_imm(regD dst, regD src, immD con) %{
%}
instruct mulF_reg(regF dst, regF src) %{
- predicate((UseSSE>=1) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (MulF dst src));
format %{ "mulss $dst, $src" %}
@@ -3338,7 +3172,7 @@ instruct mulF_reg(regF dst, regF src) %{
%}
instruct mulF_mem(regF dst, memory src) %{
- predicate((UseSSE>=1) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (MulF dst (LoadF src)));
format %{ "mulss $dst, $src" %}
@@ -3350,7 +3184,7 @@ instruct mulF_mem(regF dst, memory src) %{
%}
instruct mulF_imm(regF dst, immF con) %{
- predicate((UseSSE>=1) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (MulF dst con));
format %{ "mulss $dst, [$constantaddress]\t# load from constant table: float=$con" %}
ins_cost(150);
@@ -3397,7 +3231,7 @@ instruct mulF_reg_imm(regF dst, regF src, immF con) %{
%}
instruct mulD_reg(regD dst, regD src) %{
- predicate((UseSSE>=2) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (MulD dst src));
format %{ "mulsd $dst, $src" %}
@@ -3409,7 +3243,7 @@ instruct mulD_reg(regD dst, regD src) %{
%}
instruct mulD_mem(regD dst, memory src) %{
- predicate((UseSSE>=2) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (MulD dst (LoadD src)));
format %{ "mulsd $dst, $src" %}
@@ -3421,7 +3255,7 @@ instruct mulD_mem(regD dst, memory src) %{
%}
instruct mulD_imm(regD dst, immD con) %{
- predicate((UseSSE>=2) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (MulD dst con));
format %{ "mulsd $dst, [$constantaddress]\t# load from constant table: double=$con" %}
ins_cost(150);
@@ -3468,7 +3302,7 @@ instruct mulD_reg_imm(regD dst, regD src, immD con) %{
%}
instruct divF_reg(regF dst, regF src) %{
- predicate((UseSSE>=1) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (DivF dst src));
format %{ "divss $dst, $src" %}
@@ -3480,7 +3314,7 @@ instruct divF_reg(regF dst, regF src) %{
%}
instruct divF_mem(regF dst, memory src) %{
- predicate((UseSSE>=1) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (DivF dst (LoadF src)));
format %{ "divss $dst, $src" %}
@@ -3492,7 +3326,7 @@ instruct divF_mem(regF dst, memory src) %{
%}
instruct divF_imm(regF dst, immF con) %{
- predicate((UseSSE>=1) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (DivF dst con));
format %{ "divss $dst, [$constantaddress]\t# load from constant table: float=$con" %}
ins_cost(150);
@@ -3539,7 +3373,7 @@ instruct divF_reg_imm(regF dst, regF src, immF con) %{
%}
instruct divD_reg(regD dst, regD src) %{
- predicate((UseSSE>=2) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (DivD dst src));
format %{ "divsd $dst, $src" %}
@@ -3551,7 +3385,7 @@ instruct divD_reg(regD dst, regD src) %{
%}
instruct divD_mem(regD dst, memory src) %{
- predicate((UseSSE>=2) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (DivD dst (LoadD src)));
format %{ "divsd $dst, $src" %}
@@ -3563,7 +3397,7 @@ instruct divD_mem(regD dst, memory src) %{
%}
instruct divD_imm(regD dst, immD con) %{
- predicate((UseSSE>=2) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (DivD dst con));
format %{ "divsd $dst, [$constantaddress]\t# load from constant table: double=$con" %}
ins_cost(150);
@@ -3610,7 +3444,7 @@ instruct divD_reg_imm(regD dst, regD src, immD con) %{
%}
instruct absF_reg(regF dst) %{
- predicate((UseSSE>=1) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (AbsF dst));
ins_cost(150);
format %{ "andps $dst, [0x7fffffff]\t# abs float by sign masking" %}
@@ -3634,7 +3468,7 @@ instruct absF_reg_reg(vlRegF dst, vlRegF src) %{
%}
instruct absD_reg(regD dst) %{
- predicate((UseSSE>=2) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (AbsD dst));
ins_cost(150);
format %{ "andpd $dst, [0x7fffffffffffffff]\t"
@@ -3660,7 +3494,7 @@ instruct absD_reg_reg(vlRegD dst, vlRegD src) %{
%}
instruct negF_reg(regF dst) %{
- predicate((UseSSE>=1) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (NegF dst));
ins_cost(150);
format %{ "xorps $dst, [0x80000000]\t# neg float by sign flipping" %}
@@ -3683,7 +3517,7 @@ instruct negF_reg_reg(vlRegF dst, vlRegF src) %{
%}
instruct negD_reg(regD dst) %{
- predicate((UseSSE>=2) && (UseAVX == 0));
+ predicate(UseAVX == 0);
match(Set dst (NegD dst));
ins_cost(150);
format %{ "xorpd $dst, [0x8000000000000000]\t"
@@ -3710,7 +3544,6 @@ instruct negD_reg_reg(vlRegD dst, vlRegD src) %{
// sqrtss instruction needs destination register to be pre initialized for best performance
// Therefore only the instruct rule where the input is pre-loaded into dst register is defined below
instruct sqrtF_reg(regF dst) %{
- predicate(UseSSE>=1);
match(Set dst (SqrtF dst));
format %{ "sqrtss $dst, $dst" %}
ins_encode %{
@@ -3722,7 +3555,6 @@ instruct sqrtF_reg(regF dst) %{
// sqrtsd instruction needs destination register to be pre initialized for best performance
// Therefore only the instruct rule where the input is pre-loaded into dst register is defined below
instruct sqrtD_reg(regD dst) %{
- predicate(UseSSE>=2);
match(Set dst (SqrtD dst));
format %{ "sqrtsd $dst, $dst" %}
ins_encode %{
@@ -3970,7 +3802,6 @@ instruct reinterpret_shrink(vec dst, legVec src) %{
// ----------------------------------------------------------------------------------------------------
-#ifdef _LP64
instruct roundD_reg(legRegD dst, legRegD src, immU8 rmode) %{
match(Set dst (RoundDoubleMode src rmode));
format %{ "roundsd $dst,$src" %}
@@ -4041,7 +3872,6 @@ instruct vround8D_mem(vec dst, memory mem, immU8 rmode) %{
%}
ins_pipe( pipe_slow );
%}
-#endif // _LP64
instruct onspinwait() %{
match(OnSpinWait);
@@ -4259,7 +4089,6 @@ instruct vgather_subwordGT8B_off(vec dst, memory mem, rRegP idx_base, rRegI offs
%}
-#ifdef _LP64
instruct vgather_masked_subwordLE8B_avx3(vec dst, memory mem, rRegP idx_base, immI_0 offset, kReg mask, rRegL mask_idx, rRegP tmp, rRegI rtmp, rRegL rtmp2, rFlagsReg cr) %{
predicate(VM_Version::supports_avx512bw() && is_subword_type(Matcher::vector_element_basic_type(n)) && Matcher::vector_length_in_bytes(n) <= 8);
match(Set dst (LoadVectorGatherMasked mem (Binary idx_base (Binary mask offset))));
@@ -4422,7 +4251,6 @@ instruct vgather_masked_subwordGT8B_off_avx2(vec dst, memory mem, rRegP idx_base
%}
ins_pipe( pipe_slow );
%}
-#endif
// ====================Scatter=======================================
@@ -4538,7 +4366,6 @@ instruct vReplS_reg(vec dst, rRegI src) %{
ins_pipe( pipe_slow );
%}
-#ifdef _LP64
instruct ReplHF_imm(vec dst, immH con, rRegI rtmp) %{
match(Set dst (Replicate con));
effect(TEMP rtmp);
@@ -4565,7 +4392,6 @@ instruct ReplHF_reg(vec dst, regF src, rRegI rtmp) %{
%}
ins_pipe( pipe_slow );
%}
-#endif
instruct ReplS_mem(vec dst, memory mem) %{
predicate(UseAVX >= 2 && Matcher::vector_element_basic_type(n) == T_SHORT);
@@ -4650,7 +4476,7 @@ instruct ReplI_zero(vec dst, immI_0 zero) %{
%}
instruct ReplI_M1(vec dst, immI_M1 con) %{
- predicate(UseSSE >= 2 && Matcher::is_non_long_integral_vector(n));
+ predicate(Matcher::is_non_long_integral_vector(n));
match(Set dst (Replicate con));
format %{ "vallones $dst" %}
ins_encode %{
@@ -4662,7 +4488,6 @@ instruct ReplI_M1(vec dst, immI_M1 con) %{
// ====================ReplicateL=======================================
-#ifdef _LP64
// Replicate long (8 byte) scalar to be vector
instruct ReplL_reg(vec dst, rRegL src) %{
predicate(Matcher::vector_element_basic_type(n) == T_LONG);
@@ -4683,61 +4508,6 @@ instruct ReplL_reg(vec dst, rRegL src) %{
%}
ins_pipe( pipe_slow );
%}
-#else // _LP64
-// Replicate long (8 byte) scalar to be vector
-instruct ReplL_reg(vec dst, eRegL src, vec tmp) %{
- predicate(Matcher::vector_length(n) <= 4 && Matcher::vector_element_basic_type(n) == T_LONG);
- match(Set dst (Replicate src));
- effect(TEMP dst, USE src, TEMP tmp);
- format %{ "replicateL $dst,$src" %}
- ins_encode %{
- uint vlen = Matcher::vector_length(this);
- if (vlen == 2) {
- __ movdl($dst$$XMMRegister, $src$$Register);
- __ movdl($tmp$$XMMRegister, HIGH_FROM_LOW($src$$Register));
- __ punpckldq($dst$$XMMRegister, $tmp$$XMMRegister);
- __ punpcklqdq($dst$$XMMRegister, $dst$$XMMRegister);
- } else if (VM_Version::supports_avx512vl()) { // AVX512VL for <512bit operands
- int vlen_enc = Assembler::AVX_256bit;
- __ movdl($dst$$XMMRegister, $src$$Register);
- __ movdl($tmp$$XMMRegister, HIGH_FROM_LOW($src$$Register));
- __ punpckldq($dst$$XMMRegister, $tmp$$XMMRegister);
- __ vpbroadcastq($dst$$XMMRegister, $dst$$XMMRegister, vlen_enc);
- } else {
- __ movdl($dst$$XMMRegister, $src$$Register);
- __ movdl($tmp$$XMMRegister, HIGH_FROM_LOW($src$$Register));
- __ punpckldq($dst$$XMMRegister, $tmp$$XMMRegister);
- __ punpcklqdq($dst$$XMMRegister, $dst$$XMMRegister);
- __ vinserti128_high($dst$$XMMRegister, $dst$$XMMRegister);
- }
- %}
- ins_pipe( pipe_slow );
-%}
-
-instruct ReplL_reg_leg(legVec dst, eRegL src, legVec tmp) %{
- predicate(Matcher::vector_length(n) == 8 && Matcher::vector_element_basic_type(n) == T_LONG);
- match(Set dst (Replicate src));
- effect(TEMP dst, USE src, TEMP tmp);
- format %{ "replicateL $dst,$src" %}
- ins_encode %{
- if (VM_Version::supports_avx512vl()) {
- __ movdl($dst$$XMMRegister, $src$$Register);
- __ movdl($tmp$$XMMRegister, HIGH_FROM_LOW($src$$Register));
- __ punpckldq($dst$$XMMRegister, $tmp$$XMMRegister);
- __ punpcklqdq($dst$$XMMRegister, $dst$$XMMRegister);
- __ vinserti128_high($dst$$XMMRegister, $dst$$XMMRegister);
- __ vinserti64x4($dst$$XMMRegister, $dst$$XMMRegister, $dst$$XMMRegister, 0x1);
- } else {
- int vlen_enc = Assembler::AVX_512bit;
- __ movdl($dst$$XMMRegister, $src$$Register);
- __ movdl($tmp$$XMMRegister, HIGH_FROM_LOW($src$$Register));
- __ punpckldq($dst$$XMMRegister, $tmp$$XMMRegister);
- __ vpbroadcastq($dst$$XMMRegister, $dst$$XMMRegister, vlen_enc);
- }
- %}
- ins_pipe( pipe_slow );
-%}
-#endif // _LP64
instruct ReplL_mem(vec dst, memory mem) %{
predicate(Matcher::vector_element_basic_type(n) == T_LONG);
@@ -4786,7 +4556,7 @@ instruct ReplL_zero(vec dst, immL0 zero) %{
%}
instruct ReplL_M1(vec dst, immL_M1 con) %{
- predicate(UseSSE >= 2 && Matcher::vector_element_basic_type(n) == T_LONG);
+ predicate(Matcher::vector_element_basic_type(n) == T_LONG);
match(Set dst (Replicate con));
format %{ "vallones $dst" %}
ins_encode %{
@@ -5011,7 +4781,6 @@ instruct insert64(vec dst, vec src, rRegI val, immU8 idx, legVec vtmp) %{
ins_pipe( pipe_slow );
%}
-#ifdef _LP64
instruct insert2L(vec dst, rRegL val, immU8 idx) %{
predicate(Matcher::vector_length(n) == 2);
match(Set dst (VectorInsert (Binary dst val) idx));
@@ -5062,7 +4831,6 @@ instruct insert8L(vec dst, vec src, rRegL val, immU8 idx, legVec vtmp) %{
%}
ins_pipe( pipe_slow );
%}
-#endif
instruct insertF(vec dst, regF val, immU8 idx) %{
predicate(Matcher::vector_length(n) < 8);
@@ -5108,7 +4876,6 @@ instruct vinsertF(vec dst, vec src, regF val, immU8 idx, vec vtmp) %{
ins_pipe( pipe_slow );
%}
-#ifdef _LP64
instruct insert2D(vec dst, regD val, immU8 idx, rRegL tmp) %{
predicate(Matcher::vector_length(n) == 2);
match(Set dst (VectorInsert (Binary dst val) idx));
@@ -5163,7 +4930,6 @@ instruct insert8D(vec dst, vec src, regD val, immI idx, rRegL tmp, legVec vtmp)
%}
ins_pipe( pipe_slow );
%}
-#endif
// ====================REDUCTION ARITHMETIC=======================================
@@ -5190,7 +4956,6 @@ instruct reductionI(rRegI dst, rRegI src1, legVec src2, legVec vtmp1, legVec vtm
// =======================Long Reduction==========================================
-#ifdef _LP64
instruct reductionL(rRegL dst, rRegL src1, legVec src2, legVec vtmp1, legVec vtmp2) %{
predicate(Matcher::vector_element_basic_type(n->in(2)) == T_LONG && !VM_Version::supports_avx512dq());
match(Set dst (AddReductionVL src1 src2));
@@ -5228,7 +4993,6 @@ instruct reductionL_avx512dq(rRegL dst, rRegL src1, vec src2, vec vtmp1, vec vtm
%}
ins_pipe( pipe_slow );
%}
-#endif // _LP64
// =======================Float Reduction==========================================
@@ -5440,7 +5204,6 @@ instruct unordered_reduction8D(regD dst, regD src1, legVec src2, legVec vtmp1, l
// =======================Byte Reduction==========================================
-#ifdef _LP64
instruct reductionB(rRegI dst, rRegI src1, legVec src2, legVec vtmp1, legVec vtmp2) %{
predicate(Matcher::vector_element_basic_type(n->in(2)) == T_BYTE && !VM_Version::supports_avx512bw());
match(Set dst (AddReductionVI src1 src2));
@@ -5476,7 +5239,6 @@ instruct reductionB_avx512bw(rRegI dst, rRegI src1, vec src2, vec vtmp1, vec vtm
%}
ins_pipe( pipe_slow );
%}
-#endif
// =======================Short Reduction==========================================
@@ -6777,7 +6539,6 @@ instruct signumV_reg_evex(vec dst, vec src, vec zero, vec one, kReg ktmp1) %{
// Result going from high bit to low bit is 0x11100100 = 0xe4
// ---------------------------------------
-#ifdef _LP64
instruct copySignF_reg(regF dst, regF src, regF tmp1, rRegI tmp2) %{
match(Set dst (CopySignF dst src));
effect(TEMP tmp1, TEMP tmp2);
@@ -6803,8 +6564,6 @@ instruct copySignD_imm(regD dst, regD src, regD tmp1, rRegL tmp2, immD zero) %{
ins_pipe( pipe_slow );
%}
-#endif // _LP64
-
//----------------------------- CompressBits/ExpandBits ------------------------
instruct compressBitsI_reg(rRegI dst, rRegI src, rRegI mask) %{
@@ -7169,7 +6928,6 @@ instruct vshiftL_arith_reg(vec dst, vec src, vec shift, vec tmp) %{
ins_encode %{
uint vlen = Matcher::vector_length(this);
if (vlen == 2) {
- assert(UseSSE >= 2, "required");
__ movdqu($dst$$XMMRegister, $src$$XMMRegister);
__ psrlq($dst$$XMMRegister, $shift$$XMMRegister);
__ movdqu($tmp$$XMMRegister, ExternalAddress(vector_long_sign_mask()), noreg);
@@ -7977,7 +7735,6 @@ instruct vucast(vec dst, vec src) %{
ins_pipe( pipe_slow );
%}
-#ifdef _LP64
instruct vround_float_avx(vec dst, vec src, rRegP tmp, vec xtmp1, vec xtmp2, vec xtmp3, vec xtmp4, rFlagsReg cr) %{
predicate(!VM_Version::supports_avx512vl() &&
Matcher::vector_length_in_bytes(n) < 64 &&
@@ -8027,8 +7784,6 @@ instruct vround_reg_evex(vec dst, vec src, rRegP tmp, vec xtmp1, vec xtmp2, kReg
ins_pipe( pipe_slow );
%}
-#endif // _LP64
-
// --------------------------------- VectorMaskCmp --------------------------------------
instruct vcmpFD(legVec dst, legVec src1, legVec src2, immI8 cond) %{
@@ -8238,9 +7993,7 @@ instruct extractI(rRegI dst, legVec src, immU8 idx) %{
predicate(Matcher::vector_length_in_bytes(n->in(1)) <= 16); // src
match(Set dst (ExtractI src idx));
match(Set dst (ExtractS src idx));
-#ifdef _LP64
match(Set dst (ExtractB src idx));
-#endif
format %{ "extractI $dst,$src,$idx\t!" %}
ins_encode %{
assert($idx$$constant < (int)Matcher::vector_length(this, $src), "out of bounds");
@@ -8256,9 +8009,7 @@ instruct vextractI(rRegI dst, legVec src, immI idx, legVec vtmp) %{
Matcher::vector_length_in_bytes(n->in(1)) == 64); // src
match(Set dst (ExtractI src idx));
match(Set dst (ExtractS src idx));
-#ifdef _LP64
match(Set dst (ExtractB src idx));
-#endif
effect(TEMP vtmp);
format %{ "vextractI $dst,$src,$idx\t! using $vtmp as TEMP" %}
ins_encode %{
@@ -8271,7 +8022,6 @@ instruct vextractI(rRegI dst, legVec src, immI idx, legVec vtmp) %{
ins_pipe( pipe_slow );
%}
-#ifdef _LP64
instruct extractL(rRegL dst, legVec src, immU8 idx) %{
predicate(Matcher::vector_length(n->in(1)) <= 2); // src
match(Set dst (ExtractL src idx));
@@ -8299,7 +8049,6 @@ instruct vextractL(rRegL dst, legVec src, immU8 idx, legVec vtmp) %{
%}
ins_pipe( pipe_slow );
%}
-#endif
instruct extractF(legRegF dst, legVec src, immU8 idx, legVec vtmp) %{
predicate(Matcher::vector_length(n->in(1)) <= 4);
@@ -8552,7 +8301,6 @@ instruct vabsnegD(vec dst, vec src) %{
int opcode = this->ideal_Opcode();
uint vlen = Matcher::vector_length(this);
if (vlen == 2) {
- assert(UseSSE >= 2, "required");
__ vabsnegd(opcode, $dst$$XMMRegister, $src$$XMMRegister);
} else {
int vlen_enc = vector_length_encoding(this);
@@ -8564,7 +8312,6 @@ instruct vabsnegD(vec dst, vec src) %{
//------------------------------------- VectorTest --------------------------------------------
-#ifdef _LP64
instruct vptest_lt16(rFlagsRegU cr, legVec src1, legVec src2, legVec vtmp) %{
predicate(Matcher::vector_length_in_bytes(n->in(1)) < 16);
match(Set cr (VectorTest src1 src2));
@@ -8632,7 +8379,6 @@ instruct ktest_ge8(rFlagsRegU cr, kReg src1, kReg src2) %{
%}
ins_pipe( pipe_slow );
%}
-#endif
//------------------------------------- LoadMask --------------------------------------------
@@ -8883,7 +8629,6 @@ instruct loadIotaIndices(vec dst, immI_0 src) %{
ins_pipe( pipe_slow );
%}
-#ifdef _LP64
instruct VectorPopulateIndex(vec dst, rRegI src1, immI_1 src2, vec vtmp) %{
match(Set dst (PopulateIndex src1 src2));
effect(TEMP dst, TEMP vtmp);
@@ -8915,7 +8660,7 @@ instruct VectorPopulateLIndex(vec dst, rRegL src1, immI_1 src2, vec vtmp) %{
%}
ins_pipe( pipe_slow );
%}
-#endif
+
//-------------------------------- Rearrange ----------------------------------
// LoadShuffle/Rearrange for Byte
@@ -9496,7 +9241,6 @@ instruct vmasked_store_evex(memory mem, vec src, kReg mask) %{
ins_pipe( pipe_slow );
%}
-#ifdef _LP64
instruct verify_vector_alignment(rRegP addr, immL32 mask, rFlagsReg cr) %{
match(Set addr (VerifyVectorAlignment addr mask));
effect(KILL cr);
@@ -9710,7 +9454,6 @@ instruct vmask_first_or_last_true_avx(rRegI dst, vec mask, immI size, rRegL tmp,
%}
// --------------------------------- Compress/Expand Operations ---------------------------
-#ifdef _LP64
instruct vcompress_reg_avx(vec dst, vec src, vec mask, rRegI rtmp, rRegL rscratch, vec perm, vec xtmp, rFlagsReg cr) %{
predicate(!VM_Version::supports_avx512vl() && Matcher::vector_length_in_bytes(n) <= 32);
match(Set dst (CompressV src mask));
@@ -9726,7 +9469,6 @@ instruct vcompress_reg_avx(vec dst, vec src, vec mask, rRegI rtmp, rRegL rscratc
%}
ins_pipe( pipe_slow );
%}
-#endif
instruct vcompress_expand_reg_evex(vec dst, vec src, kReg mask) %{
predicate(VM_Version::supports_avx512vl() || Matcher::vector_length_in_bytes(n) == 64);
@@ -9754,8 +9496,6 @@ instruct vcompress_mask_reg_evex(kReg dst, kReg mask, rRegL rtmp1, rRegL rtmp2,
ins_pipe( pipe_slow );
%}
-#endif // _LP64
-
// -------------------------------- Bit and Byte Reversal Vector Operations ------------------------
instruct vreverse_reg(vec dst, vec src, vec xtmp1, vec xtmp2, rRegI rtmp) %{
@@ -10476,7 +10216,6 @@ instruct mask_all_evexI_LE32(kReg dst, rRegI src) %{
ins_pipe( pipe_slow );
%}
-#ifdef _LP64
instruct mask_not_immLT8(kReg dst, kReg src, rRegI rtmp, kReg ktmp, immI_M1 cnt) %{
predicate(Matcher::vector_length(n) < 8 && VM_Version::supports_avx512dq());
match(Set dst (XorVMask src (MaskAll cnt)));
@@ -10541,7 +10280,6 @@ instruct long_to_mask_evex(kReg dst, rRegL src) %{
%}
ins_pipe( pipe_slow );
%}
-#endif
instruct mask_opers_evex(kReg dst, kReg src1, kReg src2, kReg kscratch) %{
match(Set dst (AndVMask src1 src2));
@@ -10894,6 +10632,16 @@ instruct reinterpretS2HF(regF dst, rRegI src)
ins_pipe(pipe_slow);
%}
+instruct reinterpretHF2S(rRegI dst, regF src)
+%{
+ match(Set dst (ReinterpretHF2S src));
+ format %{ "vmovw $dst, $src" %}
+ ins_encode %{
+ __ vmovw($dst$$Register, $src$$XMMRegister);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
instruct convF2HFAndS2HF(regF dst, regF src)
%{
match(Set dst (ReinterpretS2HF (ConvF2HF src)));
@@ -10914,16 +10662,6 @@ instruct convHF2SAndHF2F(regF dst, regF src)
ins_pipe(pipe_slow);
%}
-instruct reinterpretHF2S(rRegI dst, regF src)
-%{
- match(Set dst (ReinterpretHF2S src));
- format %{ "vmovw $dst, $src" %}
- ins_encode %{
- __ vmovw($dst$$Register, $src$$XMMRegister);
- %}
- ins_pipe(pipe_slow);
-%}
-
instruct scalar_sqrt_HF_reg(regF dst, regF src)
%{
match(Set dst (SqrtHF src));
@@ -10957,7 +10695,7 @@ instruct scalar_minmax_HF_reg(regF dst, regF src1, regF src2, kReg ktmp, regF xt
ins_encode %{
int opcode = this->ideal_Opcode();
__ scalar_max_min_fp16(opcode, $dst$$XMMRegister, $src1$$XMMRegister, $src2$$XMMRegister, $ktmp$$KRegister,
- $xtmp1$$XMMRegister, $xtmp2$$XMMRegister, Assembler::AVX_128bit);
+ $xtmp1$$XMMRegister, $xtmp2$$XMMRegister);
%}
ins_pipe( pipe_slow );
%}
@@ -10972,3 +10710,94 @@ instruct scalar_fma_HF_reg(regF dst, regF src1, regF src2)
%}
ins_pipe( pipe_slow );
%}
+
+
+instruct vector_sqrt_HF_reg(vec dst, vec src)
+%{
+ match(Set dst (SqrtVHF src));
+ format %{ "vector_sqrt_fp16 $dst, $src" %}
+ ins_encode %{
+ int vlen_enc = vector_length_encoding(this);
+ __ evsqrtph($dst$$XMMRegister, $src$$XMMRegister, vlen_enc);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vector_sqrt_HF_mem(vec dst, memory src)
+%{
+ match(Set dst (SqrtVHF (VectorReinterpret (LoadVector src))));
+ format %{ "vector_sqrt_fp16_mem $dst, $src" %}
+ ins_encode %{
+ int vlen_enc = vector_length_encoding(this);
+ __ evsqrtph($dst$$XMMRegister, $src$$Address, vlen_enc);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vector_binOps_HF_reg(vec dst, vec src1, vec src2)
+%{
+ match(Set dst (AddVHF src1 src2));
+ match(Set dst (DivVHF src1 src2));
+ match(Set dst (MulVHF src1 src2));
+ match(Set dst (SubVHF src1 src2));
+ format %{ "vector_binop_fp16 $dst, $src1, $src2" %}
+ ins_encode %{
+ int vlen_enc = vector_length_encoding(this);
+ int opcode = this->ideal_Opcode();
+ __ evfp16ph(opcode, $dst$$XMMRegister, $src1$$XMMRegister, $src2$$XMMRegister, vlen_enc);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+
+instruct vector_binOps_HF_mem(vec dst, vec src1, memory src2)
+%{
+ match(Set dst (AddVHF src1 (VectorReinterpret (LoadVector src2))));
+ match(Set dst (DivVHF src1 (VectorReinterpret (LoadVector src2))));
+ match(Set dst (MulVHF src1 (VectorReinterpret (LoadVector src2))));
+ match(Set dst (SubVHF src1 (VectorReinterpret (LoadVector src2))));
+ format %{ "vector_binop_fp16_mem $dst, $src1, $src2" %}
+ ins_encode %{
+ int vlen_enc = vector_length_encoding(this);
+ int opcode = this->ideal_Opcode();
+ __ evfp16ph(opcode, $dst$$XMMRegister, $src1$$XMMRegister, $src2$$Address, vlen_enc);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct vector_fma_HF_reg(vec dst, vec src1, vec src2)
+%{
+ match(Set dst (FmaVHF src2 (Binary dst src1)));
+ format %{ "vector_fma_fp16 $dst, $src1, $src2\t# $dst = $dst * $src1 + $src2 fma packedH" %}
+ ins_encode %{
+ int vlen_enc = vector_length_encoding(this);
+ __ evfmadd132ph($dst$$XMMRegister, $src2$$XMMRegister, $src1$$XMMRegister, vlen_enc);
+ %}
+ ins_pipe( pipe_slow );
+%}
+
+instruct vector_fma_HF_mem(vec dst, memory src1, vec src2)
+%{
+ match(Set dst (FmaVHF src2 (Binary dst (VectorReinterpret (LoadVector src1)))));
+ format %{ "vector_fma_fp16_mem $dst, $src1, $src2\t# $dst = $dst * $src1 + $src2 fma packedH" %}
+ ins_encode %{
+ int vlen_enc = vector_length_encoding(this);
+ __ evfmadd132ph($dst$$XMMRegister, $src2$$XMMRegister, $src1$$Address, vlen_enc);
+ %}
+ ins_pipe( pipe_slow );
+%}
+
+instruct vector_minmax_HF_reg(vec dst, vec src1, vec src2, kReg ktmp, vec xtmp1, vec xtmp2)
+%{
+ match(Set dst (MinVHF src1 src2));
+ match(Set dst (MaxVHF src1 src2));
+ effect(TEMP_DEF dst, TEMP ktmp, TEMP xtmp1, TEMP xtmp2);
+ format %{ "vector_min_max_fp16 $dst, $src1, $src2\t using $ktmp, $xtmp1 and $xtmp2 as TEMP" %}
+ ins_encode %{
+ int vlen_enc = vector_length_encoding(this);
+ int opcode = this->ideal_Opcode();
+ __ vector_max_min_fp16(opcode, $dst$$XMMRegister, $src2$$XMMRegister, $src1$$XMMRegister, $ktmp$$KRegister,
+ $xtmp1$$XMMRegister, $xtmp2$$XMMRegister, vlen_enc);
+ %}
+ ins_pipe( pipe_slow );
+%}
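// [Editor's note] evfp16ph(), evsqrtph(), evfmadd132ph() and vector_max_min_fp16()
// used by the FP16 rules above are C2 macro-assembler helpers defined outside this
// hunk. As a rough, self-contained sketch (all identifiers below are local stand-ins,
// not HotSpot names), the binary-op helper presumably maps the ideal opcode onto the
// corresponding AVX512-FP16 packed instruction:
#include <cassert>
enum HalfFloatOpSketch { HF_Add, HF_Sub, HF_Mul, HF_Div };
static const char* fp16_mnemonic_sketch(HalfFloatOpSketch op) {
  switch (op) {
    case HF_Add: return "vaddph";  // packed half-precision add
    case HF_Sub: return "vsubph";  // packed half-precision subtract
    case HF_Mul: return "vmulph";  // packed half-precision multiply
    case HF_Div: return "vdivph";  // packed half-precision divide
  }
  assert(false && "unreachable");
  return nullptr;
}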
diff --git a/src/hotspot/cpu/x86/x86_64.ad b/src/hotspot/cpu/x86/x86_64.ad
index b94ff7dbd9e5a..25cee7a3094cd 100644
--- a/src/hotspot/cpu/x86/x86_64.ad
+++ b/src/hotspot/cpu/x86/x86_64.ad
@@ -422,6 +422,18 @@ source_hpp %{
#include "peephole_x86_64.hpp"
+bool castLL_is_imm32(const Node* n);
+
+%}
+
+source %{
+
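+// Returns true when each bound of the CastLL's long type is either unbounded
+// (min_jlong/max_jlong) or fits in a signed 32-bit immediate; the checked castLL
+// rules below use this to pick the variant that needs no TEMP register.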
+bool castLL_is_imm32(const Node* n) {
+ assert(n->is_CastLL(), "must be a CastLL");
+ const TypeLong* t = n->bottom_type()->is_long();
+ return (t->_lo == min_jlong || Assembler::is_simm32(t->_lo)) && (t->_hi == max_jlong || Assembler::is_simm32(t->_hi));
+}
+
%}
// Register masks
@@ -845,7 +857,7 @@ void MachPrologNode::emit(C2_MacroAssembler *masm, PhaseRegAlloc *ra_) const {
Register klass = rscratch1;
__ mov_metadata(klass, C->method()->holder()->constant_encoding());
- __ clinit_barrier(klass, r15_thread, &L_skip_barrier /*L_fast_path*/);
+ __ clinit_barrier(klass, &L_skip_barrier /*L_fast_path*/);
__ jump(RuntimeAddress(SharedRuntime::get_handle_wrong_method_stub())); // slow path
@@ -943,7 +955,7 @@ void MachEpilogNode::emit(C2_MacroAssembler* masm, PhaseRegAlloc* ra_) const
code_stub = &stub->entry();
}
__ relocate(relocInfo::poll_return_type);
- __ safepoint_poll(*code_stub, r15_thread, true /* at_return */, true /* in_nmethod */);
+ __ safepoint_poll(*code_stub, true /* at_return */, true /* in_nmethod */);
}
}
@@ -1584,14 +1596,11 @@ uint MachUEPNode::size(PhaseRegAlloc* ra_) const
//=============================================================================
bool Matcher::supports_vector_calling_convention(void) {
- if (EnableVectorSupport && UseVectorStubs) {
- return true;
- }
- return false;
+ return EnableVectorSupport;
}
OptoRegPair Matcher::vector_return_value(uint ideal_reg) {
- assert(EnableVectorSupport && UseVectorStubs, "sanity");
+ assert(EnableVectorSupport, "sanity");
int lo = XMM0_num;
int hi = XMM0b_num;
if (ideal_reg == Op_VecX) hi = XMM0d_num;
@@ -1838,14 +1847,14 @@ encode %{
%}
enc_class clear_avx %{
- debug_only(int off0 = __ offset());
+ DEBUG_ONLY(int off0 = __ offset());
if (generate_vzeroupper(Compile::current())) {
// Clear upper bits of YMM registers to avoid AVX <-> SSE transition penalty
// Clear upper bits of YMM registers when current compiled code uses
// wide vectors to avoid AVX <-> SSE transition penalty during call.
__ vzeroupper();
}
- debug_only(int off1 = __ offset());
+ DEBUG_ONLY(int off1 = __ offset());
assert(off1 - off0 == clear_avx_size(), "correct size prediction");
%}
@@ -7396,7 +7405,7 @@ instruct addL_mem_imm(memory dst, immL32 src, rFlagsReg cr)
ins_pipe(ialu_mem_imm);
%}
-instruct incL_rReg(rRegI dst, immL1 src, rFlagsReg cr)
+instruct incL_rReg(rRegL dst, immL1 src, rFlagsReg cr)
%{
predicate(!UseAPX && UseIncDec);
match(Set dst (AddL dst src));
@@ -7409,7 +7418,7 @@ instruct incL_rReg(rRegI dst, immL1 src, rFlagsReg cr)
ins_pipe(ialu_reg);
%}
-instruct incL_rReg_ndd(rRegI dst, rRegI src, immL1 val, rFlagsReg cr)
+instruct incL_rReg_ndd(rRegL dst, rRegI src, immL1 val, rFlagsReg cr)
%{
predicate(UseAPX && UseIncDec);
match(Set dst (AddL src val));
@@ -7422,7 +7431,7 @@ instruct incL_rReg_ndd(rRegI dst, rRegI src, immL1 val, rFlagsReg cr)
ins_pipe(ialu_reg);
%}
-instruct incL_rReg_mem_ndd(rRegI dst, memory src, immL1 val, rFlagsReg cr)
+instruct incL_rReg_mem_ndd(rRegL dst, memory src, immL1 val, rFlagsReg cr)
%{
predicate(UseAPX && UseIncDec);
match(Set dst (AddL (LoadL src) val));
@@ -7605,6 +7614,7 @@ instruct castPP(rRegP dst)
instruct castII(rRegI dst)
%{
+ predicate(VerifyConstraintCasts == 0);
match(Set dst (CastII dst));
size(0);
@@ -7614,8 +7624,22 @@ instruct castII(rRegI dst)
ins_pipe(empty);
%}
+instruct castII_checked(rRegI dst, rFlagsReg cr)
+%{
+ predicate(VerifyConstraintCasts > 0);
+ match(Set dst (CastII dst));
+
+ effect(KILL cr);
+ format %{ "# cast_checked_II $dst" %}
+ ins_encode %{
+ __ verify_int_in_range(_idx, bottom_type()->is_int(), $dst$$Register);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
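// [Editor's note] verify_int_in_range()/verify_long_in_range() are C2 macro-assembler
// helpers introduced elsewhere in this change; their exact code is not shown in this
// hunk. Conceptually the emitted check behaves like the hedged sketch below (names,
// bounds handling and the abort path are assumptions, not the real implementation).
// The checked rules are selected only when VerifyConstraintCasts is non-zero, so the
// default configuration keeps the zero-size castII/castLL encodings.
#include <cstdio>
#include <cstdlib>
static void verify_int_in_range_sketch(unsigned node_idx, int lo, int hi, int value) {
  if (value < lo || value > hi) {
    std::fprintf(stderr, "CastII (node %u): value %d outside [%d, %d]\n",
                 node_idx, value, lo, hi);
    std::abort();  // generated code would reach a stop()/debug trap instead
  }
}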
instruct castLL(rRegL dst)
%{
+ predicate(VerifyConstraintCasts == 0);
match(Set dst (CastLL dst));
size(0);
@@ -7625,6 +7649,32 @@ instruct castLL(rRegL dst)
ins_pipe(empty);
%}
+instruct castLL_checked_L32(rRegL dst, rFlagsReg cr)
+%{
+ predicate(VerifyConstraintCasts > 0 && castLL_is_imm32(n));
+ match(Set dst (CastLL dst));
+
+ effect(KILL cr);
+ format %{ "# cast_checked_LL $dst" %}
+ ins_encode %{
+ __ verify_long_in_range(_idx, bottom_type()->is_long(), $dst$$Register, noreg);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
+instruct castLL_checked(rRegL dst, rRegL tmp, rFlagsReg cr)
+%{
+ predicate(VerifyConstraintCasts > 0 && !castLL_is_imm32(n));
+ match(Set dst (CastLL dst));
+
+ effect(KILL cr, TEMP tmp);
+ format %{ "# cast_checked_LL $dst\tusing $tmp as TEMP" %}
+ ins_encode %{
+ __ verify_long_in_range(_idx, bottom_type()->is_long(), $dst$$Register, $tmp$$Register);
+ %}
+ ins_pipe(pipe_slow);
+%}
+
instruct castFF(regF dst)
%{
match(Set dst (CastFF dst));
@@ -11345,7 +11395,7 @@ instruct xorL_rReg_mem_rReg_ndd(rRegL dst, memory src1, rRegL src2, rFlagsReg cr
ins_cost(150);
format %{ "exorq $dst, $src1, $src2\t# long ndd" %}
ins_encode %{
- __ exorq($dst$$Register, $src1$$Address, $src1$$Register, false);
+ __ exorq($dst$$Register, $src1$$Address, $src2$$Register, false);
%}
ins_pipe(ialu_reg_mem);
%}
diff --git a/src/hotspot/cpu/zero/icache_zero.hpp b/src/hotspot/cpu/zero/icache_zero.hpp
index b40e07d5e3b3c..781021a2b20ef 100644
--- a/src/hotspot/cpu/zero/icache_zero.hpp
+++ b/src/hotspot/cpu/zero/icache_zero.hpp
@@ -33,7 +33,7 @@
class ICache : public AbstractICache {
public:
- static void initialize() {}
+ static void initialize(int phase) {}
static void invalidate_word(address addr) {}
static void invalidate_range(address start, int nbytes) {}
};
diff --git a/src/hotspot/cpu/zero/sharedRuntime_zero.cpp b/src/hotspot/cpu/zero/sharedRuntime_zero.cpp
index f141135ff9571..60a873ab31f01 100644
--- a/src/hotspot/cpu/zero/sharedRuntime_zero.cpp
+++ b/src/hotspot/cpu/zero/sharedRuntime_zero.cpp
@@ -50,18 +50,17 @@ int SharedRuntime::java_calling_convention(const BasicType *sig_bt,
return 0;
}
-AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(
- MacroAssembler *masm,
- int total_args_passed,
- int comp_args_on_stack,
- const BasicType *sig_bt,
- const VMRegPair *regs,
- AdapterFingerPrint *fingerprint) {
- return AdapterHandlerLibrary::new_entry(
- fingerprint,
- CAST_FROM_FN_PTR(address,zero_null_code_stub),
- CAST_FROM_FN_PTR(address,zero_null_code_stub),
- CAST_FROM_FN_PTR(address,zero_null_code_stub));
+void SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm,
+ int total_args_passed,
+ int comp_args_on_stack,
+ const BasicType *sig_bt,
+ const VMRegPair *regs,
+ AdapterHandlerEntry* handler) {
+ handler->set_entry_points(CAST_FROM_FN_PTR(address,zero_null_code_stub),
+ CAST_FROM_FN_PTR(address,zero_null_code_stub),
+ CAST_FROM_FN_PTR(address,zero_null_code_stub),
+ nullptr);
+ return;
}
nmethod *SharedRuntime::generate_native_wrapper(MacroAssembler *masm,
diff --git a/src/hotspot/cpu/zero/vm_version_zero.cpp b/src/hotspot/cpu/zero/vm_version_zero.cpp
index e38561e19c571..3ce9227c1939c 100644
--- a/src/hotspot/cpu/zero/vm_version_zero.cpp
+++ b/src/hotspot/cpu/zero/vm_version_zero.cpp
@@ -151,6 +151,6 @@ void VM_Version::initialize_cpu_information(void) {
_no_of_threads = _no_of_cores;
_no_of_sockets = _no_of_cores;
snprintf(_cpu_name, CPU_TYPE_DESC_BUF_SIZE - 1, "Zero VM");
- snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "%s", _features_string);
+ snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "%s", _cpu_info_string);
_initialized = true;
}
diff --git a/src/hotspot/os/aix/attachListener_aix.cpp b/src/hotspot/os/aix/attachListener_aix.cpp
index 218ee04fdcc0e..e5101814f9771 100644
--- a/src/hotspot/os/aix/attachListener_aix.cpp
+++ b/src/hotspot/os/aix/attachListener_aix.cpp
@@ -73,16 +73,7 @@ class AixAttachListener: AllStatic {
static bool _atexit_registered;
- // reads a request from the given connected socket
- static AixAttachOperation* read_request(int s);
-
public:
- enum {
- ATTACH_PROTOCOL_VER = 1 // protocol version
- };
- enum {
- ATTACH_ERROR_BADVERSION = 101 // error codes
- };
static void set_path(char* path) {
if (path == nullptr) {
@@ -107,25 +98,65 @@ class AixAttachListener: AllStatic {
static void set_shutdown(bool shutdown) { _shutdown = shutdown; }
static bool is_shutdown() { return _shutdown; }
- // write the given buffer to a socket
- static int write_fully(int s, char* buf, size_t len);
-
static AixAttachOperation* dequeue();
};
+class SocketChannel : public AttachOperation::RequestReader, public AttachOperation::ReplyWriter {
+private:
+ int _socket;
+public:
+ SocketChannel(int socket) : _socket(socket) {}
+ ~SocketChannel() {
+ close();
+ }
+
+ bool opened() const {
+ return _socket != -1;
+ }
+
+ void close() {
+ if (opened()) {
+ // SHUT_RDWR is not available
+ ::shutdown(_socket, 2);
+ ::close(_socket);
+ _socket = -1;
+ }
+ }
+
+ // RequestReader
+ int read(void* buffer, int size) override {
+ ssize_t n;
+ RESTARTABLE(::read(_socket, buffer, (size_t)size), n);
+ return checked_cast<int>(n);
+ }
+
+ // ReplyWriter
+ int write(const void* buffer, int size) override {
+ ssize_t n;
+ RESTARTABLE(::write(_socket, buffer, size), n);
+ return checked_cast<int>(n);
+ }
+
+ void flush() override {
+ }
+};
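// [Editor's note] RESTARTABLE(cmd, result) is HotSpot's existing retry-on-EINTR
// wrapper from the platform os code, so the read()/write() methods above keep
// retrying system calls that are interrupted by signals. A minimal equivalent of
// what the read path expands to (a sketch, not the macro's literal definition):
#include <cerrno>
#include <unistd.h>
static ssize_t read_restartable_sketch(int fd, void* buf, size_t len) {
  ssize_t n;
  do {
    n = ::read(fd, buf, len);  // retry only when interrupted by a signal
  } while (n == -1 && errno == EINTR);
  return n;
}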
+
class AixAttachOperation: public AttachOperation {
private:
// the connection to the client
- int _socket;
+ SocketChannel _socket_channel;
public:
- void complete(jint res, bufferedStream* st);
+ AixAttachOperation(int socket) : AttachOperation(), _socket_channel(socket) {}
- void set_socket(int s) { _socket = s; }
- int socket() const { return _socket; }
+ void complete(jint res, bufferedStream* st) override;
- AixAttachOperation(char* name) : AttachOperation(name) {
- set_socket(-1);
+ ReplyWriter* get_reply_writer() override {
+ return &_socket_channel;
+ }
+
+ bool read_request() {
+ return _socket_channel.read_request(this, &_socket_channel);
}
};
@@ -137,34 +168,6 @@ bool AixAttachListener::_atexit_registered = false;
// Shutdown marker to prevent accept blocking during clean-up
volatile bool AixAttachListener::_shutdown = false;
-// Supporting class to help split a buffer into individual components
-class ArgumentIterator : public StackObj {
- private:
- char* _pos;
- char* _end;
- public:
- ArgumentIterator(char* arg_buffer, size_t arg_size) {
- _pos = arg_buffer;
- _end = _pos + arg_size - 1;
- }
- char* next() {
- if (*_pos == '\0') {
- // advance the iterator if possible (null arguments)
- if (_pos < _end) {
- _pos += 1;
- }
- return nullptr;
- }
- char* res = _pos;
- char* next_pos = strchr(_pos, '\0');
- if (next_pos < _end) {
- next_pos++;
- }
- _pos = next_pos;
- return res;
- }
-};
-
// On AIX if sockets block until all data has been transmitted
// successfully in some communication domains a socket "close" may
// never complete. We have to take care that after the socket shutdown
@@ -258,106 +261,6 @@ int AixAttachListener::init() {
return 0;
}
-// Given a socket that is connected to a peer we read the request and
-// create an AttachOperation. As the socket is blocking there is potential
-// for a denial-of-service if the peer does not respond. However this happens
-// after the peer credentials have been checked and in the worst case it just
-// means that the attach listener thread is blocked.
-//
-AixAttachOperation* AixAttachListener::read_request(int s) {
- char ver_str[8];
- os::snprintf_checked(ver_str, sizeof(ver_str), "%d", ATTACH_PROTOCOL_VER);
-
- // The request is a sequence of strings so we first figure out the
- // expected count and the maximum possible length of the request.
- // The request is:
- //   <ver>0<cmd>0<arg>0<arg>0<arg>0
- // where <ver> is the protocol version (1), <cmd> is the command
- // name ("load", "datadump", ...), and <arg> is an argument
- int expected_str_count = 2 + AttachOperation::arg_count_max;
- const size_t max_len = (sizeof(ver_str) + 1) + (AttachOperation::name_length_max + 1) +
- AttachOperation::arg_count_max*(AttachOperation::arg_length_max + 1);
-
- char buf[max_len];
- int str_count = 0;
-
- // Read until all (expected) strings have been read, the buffer is
- // full, or EOF.
-
- size_t off = 0;
- size_t left = max_len;
-
- do {
- ssize_t n;
- // Don't block on interrupts because this will
- // hang in the clean-up when shutting down.
- n = read(s, buf+off, left);
- assert(n <= checked_cast<ssize_t>(left), "buffer was too small, impossible!");
- buf[max_len - 1] = '\0';
- if (n == -1) {
- return nullptr; // reset by peer or other error
- }
- if (n == 0) {
- break;
- }
- for (int i=0; i<n; i++) {
- if (buf[off+i] == 0) {
- // EOS found
- str_count++;
-
- // The first string is <ver> so check it now to
- // check for protocol mismatch
- if (str_count == 1) {
- if ((strlen(buf) != strlen(ver_str)) ||
- (atoi(buf) != ATTACH_PROTOCOL_VER)) {
- char msg[32];
- os::snprintf_checked(msg, sizeof(msg), "%d\n", ATTACH_ERROR_BADVERSION);
- write_fully(s, msg, strlen(msg));
- return nullptr;
- }
- }
- }
- }
- off += n;
- left -= n;
- } while (left > 0 && str_count < expected_str_count);
-
- if (str_count != expected_str_count) {
- return nullptr; // incomplete request
- }
-
- // parse request
-
- ArgumentIterator args(buf, (max_len)-left);
-
- // version already checked
- char* v = args.next();
-
- char* name = args.next();
- if (name == nullptr || strlen(name) > AttachOperation::name_length_max) {
- return nullptr;
- }
-
- AixAttachOperation* op = new AixAttachOperation(name);
-
- for (int i=0; i<AttachOperation::arg_count_max; i++) {
- char* arg = args.next();
- if (arg == nullptr) {
- op->set_arg(i, nullptr);
- } else {
- if (strlen(arg) > AttachOperation::arg_length_max) {
- delete op;
- return nullptr;
- }
- op->set_arg(i, arg);
- }
- }
-
- op->set_socket(s);
- return op;
-}
-
-
// Dequeue an operation
//
// In the Aix implementation there is only a single operation and clients
@@ -402,9 +305,9 @@ AixAttachOperation* AixAttachListener::dequeue() {
}
// peer credential look okay so we read the request
- AixAttachOperation* op = read_request(s);
- if (op == nullptr) {
- ::close(s);
+ AixAttachOperation* op = new AixAttachOperation(s);
+ if (!op->read_request()) {
+ delete op;
continue;
} else {
return op;
@@ -412,21 +315,6 @@ AixAttachOperation* AixAttachListener::dequeue() {
}
}
-// write the given buffer to the socket
-int AixAttachListener::write_fully(int s, char* buf, size_t len) {
- do {
- ssize_t n = ::write(s, buf, len);
- if (n == -1) {
- if (errno != EINTR) return -1;
- } else {
- buf += n;
- len -= n;
- }
- }
- while (len > 0);
- return 0;
-}
-
// Complete an operation by sending the operation result and any result
// output to the client. At this time the socket is in blocking mode so
// potentially we can block if there is a lot of data and the client is
@@ -436,24 +324,6 @@ int AixAttachListener::write_fully(int s, char* buf, size_t len) {
// socket could be made non-blocking and a timeout could be used.
void AixAttachOperation::complete(jint result, bufferedStream* st) {
- JavaThread* thread = JavaThread::current();
- ThreadBlockInVM tbivm(thread);
-
- // write operation result
- char msg[32];
- os::snprintf_checked(msg, sizeof(msg), "%d\n", result);
- int rc = AixAttachListener::write_fully(this->socket(), msg, strlen(msg));
-
- // write any result data
- if (rc == 0) {
- // Shutdown the socket in the cleanup function to enable more than
- // one agent attach in a sequence (see comments to listener_cleanup()).
- AixAttachListener::write_fully(this->socket(), (char*) st->base(), st->size());
- }
-
- // done
- ::close(this->socket());
-
delete this;
}
@@ -493,6 +363,7 @@ void AttachListener::vm_start() {
}
int AttachListener::pd_init() {
+ AttachListener::set_supported_version(ATTACH_API_V2);
JavaThread* thread = JavaThread::current();
ThreadBlockInVM tbivm(thread);
diff --git a/src/hotspot/os/aix/libodm_aix.cpp b/src/hotspot/os/aix/libodm_aix.cpp
index 9fe0fb7abd842..854fd5e2b79ba 100644
--- a/src/hotspot/os/aix/libodm_aix.cpp
+++ b/src/hotspot/os/aix/libodm_aix.cpp
@@ -30,6 +30,7 @@
#include
#include "runtime/arguments.hpp"
#include "runtime/os.hpp"
+#include "utilities/permitForbiddenFunctions.hpp"
dynamicOdm::dynamicOdm() {
@@ -59,7 +60,7 @@ dynamicOdm::~dynamicOdm() {
}
-void odmWrapper::clean_data() { if (_data) { free(_data); _data = nullptr; } }
+void odmWrapper::clean_data() { if (_data) { permit_forbidden_function::free(_data); _data = nullptr; } }
int odmWrapper::class_offset(const char *field, bool is_aix_5)
diff --git a/src/hotspot/os/aix/libperfstat_aix.hpp b/src/hotspot/os/aix/libperfstat_aix.hpp
index f35f517b489f9..a984440e57976 100644
--- a/src/hotspot/os/aix/libperfstat_aix.hpp
+++ b/src/hotspot/os/aix/libperfstat_aix.hpp
@@ -332,7 +332,7 @@ typedef struct { /* component perfstat_cpu_t from AIX 7.2 documentation */
u_longlong_t busy_stolen_purr; /* Number of busy cycles stolen by the hypervisor from a dedicated partition. */
u_longlong_t busy_stolen_spurr; /* Number of busy spurr cycles stolen by the hypervisor from a dedicated partition.*/
u_longlong_t shcpus_in_sys; /* Number of physical processors allocated for shared processor use, across all shared processors pools. */
- u_longlong_t entitled_pool_capacity; /* Entitled processor capacity of partition’s pool. */
+ u_longlong_t entitled_pool_capacity; /* Entitled processor capacity of partition's pool. */
u_longlong_t pool_max_time; /* Summation of maximum time that can be consumed by the pool (nanoseconds). */
u_longlong_t pool_busy_time; /* Summation of busy (nonidle) time accumulated across all partitions in the pool (nanoseconds). */
u_longlong_t pool_scaled_busy_time; /* Scaled summation of busy (nonidle) time accumulated across all partitions in the pool (nanoseconds). */
diff --git a/src/hotspot/os/aix/loadlib_aix.cpp b/src/hotspot/os/aix/loadlib_aix.cpp
index 90a7271ad6d38..e7dbd775e3707 100644
--- a/src/hotspot/os/aix/loadlib_aix.cpp
+++ b/src/hotspot/os/aix/loadlib_aix.cpp
@@ -38,6 +38,7 @@
#include "logging/log.hpp"
#include "utilities/debug.hpp"
#include "utilities/ostream.hpp"
+#include "utilities/permitForbiddenFunctions.hpp"
// For loadquery()
#include <sys/ldr.h>
@@ -58,7 +59,7 @@ class StringList {
// Enlarge list. If oom, leave old list intact and return false.
bool enlarge() {
int cap2 = _cap + 64;
- char** l2 = (char**) ::realloc(_list, sizeof(char*) * cap2);
+ char** l2 = (char**) permit_forbidden_function::realloc(_list, sizeof(char*) * cap2);
if (!l2) {
return false;
}
@@ -76,7 +77,7 @@ class StringList {
}
}
assert0(_cap > _num);
- char* s2 = ::strdup(s);
+ char* s2 = permit_forbidden_function::strdup(s);
if (!s2) {
return nullptr;
}
@@ -170,7 +171,7 @@ static void free_entry_list(loaded_module_t** start) {
loaded_module_t* lm = *start;
while (lm) {
loaded_module_t* const lm2 = lm->next;
- ::free(lm);
+ permit_forbidden_function::free(lm);
lm = lm2;
}
*start = nullptr;
@@ -193,7 +194,7 @@ static bool reload_table() {
uint8_t* buffer = nullptr;
size_t buflen = 1024;
for (;;) {
- buffer = (uint8_t*) ::realloc(buffer, buflen);
+ buffer = (uint8_t*) permit_forbidden_function::realloc(buffer, buflen);
if (loadquery(L_GETINFO, buffer, buflen) == -1) {
if (errno == ENOMEM) {
buflen *= 2;
@@ -229,7 +230,7 @@ static bool reload_table() {
for (;;) {
- loaded_module_t* lm = (loaded_module_t*) ::malloc(sizeof(loaded_module_t));
+ loaded_module_t* lm = (loaded_module_t*) permit_forbidden_function::malloc(sizeof(loaded_module_t));
if (!lm) {
log_warning(os)("OOM.");
goto cleanup;
@@ -250,7 +251,7 @@ static bool reload_table() {
if (!lm->path) {
log_warning(os)("OOM.");
- free(lm);
+ permit_forbidden_function::free(lm);
goto cleanup;
}
@@ -272,7 +273,7 @@ static bool reload_table() {
lm->member = g_stringlist.add(p_mbr_name);
if (!lm->member) {
log_warning(os)("OOM.");
- free(lm);
+ permit_forbidden_function::free(lm);
goto cleanup;
}
} else {
@@ -320,7 +321,7 @@ static bool reload_table() {
free_entry_list(&new_list);
}
- ::free(buffer);
+ permit_forbidden_function::free(buffer);
return rc;
diff --git a/src/hotspot/os/aix/os_aix.cpp b/src/hotspot/os/aix/os_aix.cpp
index aee15e4c55a5e..1bbaf29125d7a 100644
--- a/src/hotspot/os/aix/os_aix.cpp
+++ b/src/hotspot/os/aix/os_aix.cpp
@@ -73,6 +73,7 @@
#include "utilities/defaultStream.hpp"
#include "utilities/events.hpp"
#include "utilities/growableArray.hpp"
+#include "utilities/permitForbiddenFunctions.hpp"
#include "utilities/vmError.hpp"
#if INCLUDE_JFR
#include "jfr/support/jfrNativeLibraryLoadEvent.hpp"
@@ -131,8 +132,6 @@ extern "C" int getargs(procsinfo*, int, char*, int);
#define MAX_PATH (2 * K)
-// for timer info max values which include all bits
-#define ALL_64_BITS CONST64(0xFFFFFFFFFFFFFFFF)
// for multipage initialization error analysis (in 'g_multipage_error')
#define ERROR_MP_OS_TOO_OLD 100
#define ERROR_MP_EXTSHM_ACTIVE 101
@@ -369,9 +368,9 @@ static void query_multipage_support() {
// or by environment variable LDR_CNTRL (suboption DATAPSIZE). If none is given,
// default should be 4K.
{
- void* p = ::malloc(16*M);
+ void* p = permit_forbidden_function::malloc(16*M);
g_multipage_support.datapsize = os::Aix::query_pagesize(p);
- ::free(p);
+ permit_forbidden_function::free(p);
}
// Query default shm page size (LDR_CNTRL SHMPSIZE).
@@ -905,7 +904,7 @@ jlong os::javaTimeNanos() {
}
void os::javaTimeNanos_info(jvmtiTimerInfo *info_ptr) {
- info_ptr->max_value = ALL_64_BITS;
+ info_ptr->max_value = all_bits_jlong;
// mread_real_time() is monotonic (see 'os::javaTimeNanos()')
info_ptr->may_skip_backward = false;
info_ptr->may_skip_forward = false;
@@ -1306,60 +1305,39 @@ void os::jvm_path(char *buf, jint buflen) {
char* rp = os::realpath((char *)dlinfo.dli_fname, buf, buflen);
assert(rp != nullptr, "error in realpath(): maybe the 'path' argument is too long?");
- if (Arguments::sun_java_launcher_is_altjvm()) {
- // Support for the java launcher's '-XXaltjvm=' option. Typical
- // value for buf is "