65 changes: 65 additions & 0 deletions avx512_build_setup/ADDITIONAL_README_FOR_AVX512_Version.md
@@ -0,0 +1,65 @@
Notes for AVX512 Windows build
==============================

The Visual Studio solution was updated from VS2015 to VS2022 to enable the AVX512 build.
1. Set the solution configuration to `Release-AVX512 | x64`.
2. Select the 'Generic POV-Ray > povbase' project, expand 'Backend Headers', and open
`build.h` (source/base/build.h). Set the `BUILT_BY` flag to the name and email of the
person building the code, and comment out the `#error` directive (line 129).
3. In `syspovconfig.h` (windows/povconfig/syspovconfig.h), uncomment `#define _CONSOLE` (line 56).
The AVX512 version was developed with the console version.
**Note:** The GUI project is currently skipped in the solution file because the
cmedit64.dll and povcmax64.dll from the official Windows distribution are
incompatible with VS2022. Only the console version can be built and tested.
4. Build the solution, then run the POV-Ray examples from the vs2022/bin64 folder with povconsole-avx512.exe.
```
General command example - povconsole-avx512.exe +Ibenchmark.pov
Single worker thread - povconsole-avx512.exe +WT1 benchmark.pov
Output image - benchmark.png
```
5. Results with the AVX512 version have been attached in the same folder.

Notes for UNIX build
====================

Dependencies for the Unix build:
```
libboost-dev
libboost-date-time-dev
libboost-thread-dev
libz-dev
libpng-dev
libjpeg-dev
libtiff-dev
libopenexr-dev
pkg-config (if not already installed)
```

Steps:
Generate the configure script and build the code:
```
% cd unix/
% ./prebuild.sh
% cd ../
% ./configure COMPILED_BY="your name <email@address>"
% make
```

To build with icpc :
```
% source /opt/intel/oneapi/setvars.sh
% cd unix/
% ./prebuild.sh
% cd ../
% ./configure COMPILED_BY="your name <email@address>" CXX=icpc
% make
```

Sample commands (inside the unix folder) :
```
General command example - ./povray +Ibenchmark.pov
Single worker thread - ./povray +WT1 benchmark.pov
Output image - benchmark.png
```

Binary file not shown.
1,435 changes: 1,435 additions & 0 deletions platform/x86/avx512/avx512noise.cpp

Large diffs are not rendered by default.

73 changes: 73 additions & 0 deletions platform/x86/avx512/avx512noise.h
@@ -0,0 +1,73 @@
//******************************************************************************
///
/// @file platform/x86/avx512/avx512noise.h
///
/// This file contains declarations related to implementations of the noise
/// generator optimized for the AVX512 instruction set.
///
/// @copyright
/// @parblock
///
/// Persistence of Vision Ray Tracer ('POV-Ray') version 3.8.
/// Copyright 1991-2017 Persistence of Vision Raytracer Pty. Ltd.
///
/// POV-Ray is free software: you can redistribute it and/or modify
/// it under the terms of the GNU Affero General Public License as
/// published by the Free Software Foundation, either version 3 of the
/// License, or (at your option) any later version.
///
/// POV-Ray is distributed in the hope that it will be useful,
/// but WITHOUT ANY WARRANTY; without even the implied warranty of
/// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
/// GNU Affero General Public License for more details.
///
/// You should have received a copy of the GNU Affero General Public License
/// along with this program. If not, see <http://www.gnu.org/licenses/>.
///
/// ----------------------------------------------------------------------------
///
/// POV-Ray is based on the popular DKB raytracer version 2.12.
/// DKBTrace was originally written by David K. Buck.
/// DKBTrace Ver 2.0-2.12 were written by David K. Buck & Aaron A. Collins.
///
/// @endparblock
///
//******************************************************************************

#ifndef POVRAY_AVX512NOISE_H
#define POVRAY_AVX512NOISE_H

#include "core/configcore.h"
#include "core/math/vector.h"

#ifdef TRY_OPTIMIZED_NOISE_AVX512

namespace pov
{

extern const bool kAVX512NoiseEnabled;
void AVX512NoiseInit();

/// Optimized Noise function for single input for AVX512 architecture
DBL AVX512Noise(const Vector3d& EPoint, int noise_generator);

/// Optimized DNoise function for single input for AVX512 architecture
void AVX512DNoise(Vector3d& result, const Vector3d& EPoint);

/// Optimized Noise function for two inputs using AVX512 instructions
/// @author Optimized by MCW
void AVX512Noise2D(const Vector3d& EPoint, int noise_generator, double& value);

/// Optimized DNoise function for two inputs using AVX512 instructions
/// @author Optimized by MCW
void AVX512DNoise2D(Vector3d& result, const Vector3d& EPoint);

/// Optimized Noise function for 8 multiples of single input using AVX512 instructions.
/// @author Optimized by MCW
DBL AVX512Noise8D(const Vector3d& EPoint, int noise_generator);

}

#endif // TRY_OPTIMIZED_NOISE_AVX512

#endif // POVRAY_AVX512NOISE_H
25 changes: 25 additions & 0 deletions platform/x86/cpuid.cpp
@@ -114,6 +114,7 @@ static unsigned long long getXCR0()
#define CPUID_00000001_ECX_AVX_MASK (0x1 << 28)
#define CPUID_00000001_EDX_SSE2_MASK (0x1 << 26)
#define CPUID_00000007_EBX_AVX2_MASK (0x1 << 5)
#define CPUID_00000007_EBX_AVX512_MASK (0x1 << 16)

Two constants with the same value? Is that intentional or a mistake?

Author:

Both the FMA4 and AVX512 masks are indeed `1 << 16`, and this is intentional. The two masks are applied to different registers from different CPUID leaves, so they do not collide. AVX512 is read from the EBX register of leaf 0x7:

avx512 = ((info[CPUID_EBX] & CPUID_00000007_EBX_AVX512_MASK) != 0);

and FMA4 is read from the ECX register of leaf 0x80000001:

fma4 = ((info[CPUID_ECX] & CPUID_80000001_ECX_FMA4_MASK) != 0);
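
To illustrate the point about equal mask values, here is a minimal sketch with simulated register contents. The names `Reg`, `HasBit`, `kLeaf7EbxAvx512fMask` and `kExtLeaf1EcxFma4Mask` are hypothetical stand-ins, not identifiers from cpuid.cpp; the bit positions are the ones the two macros in the diff encode.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register indices into a 4-word CPUID result (EAX, EBX, ECX, EDX).
enum Reg { kEAX = 0, kEBX = 1, kECX = 2, kEDX = 3 };

// Same numeric value, different register and different leaf.
constexpr std::uint32_t kLeaf7EbxAvx512fMask = 1u << 16; // leaf 0x7, EBX bit 16
constexpr std::uint32_t kExtLeaf1EcxFma4Mask = 1u << 16; // leaf 0x80000001, ECX bit 16

// Test one feature bit in one register of a CPUID result.
constexpr bool HasBit(const std::uint32_t (&regs)[4], Reg reg, std::uint32_t mask)
{
    return (regs[reg] & mask) != 0;
}

// Simulated example: a CPU reporting AVX512 in leaf 0x7 but no FMA4 in leaf 0x80000001.
inline void CheckMasksAreIndependent()
{
    const std::uint32_t leaf7[4] = { 0, 1u << 16, 0, 0 }; // only EBX bit 16 set
    const std::uint32_t ext1[4]  = { 0, 0, 0, 0 };        // nothing set
    assert(HasBit(leaf7, kEBX, kLeaf7EbxAvx512fMask));
    assert(!HasBit(ext1, kECX, kExtLeaf1EcxFma4Mask));
}
```

Because each mask is only ever combined with its own register, the shared value `1 << 16` cannot produce a false positive for the other feature.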

#define CPUID_80000001_ECX_FMA4_MASK (0x1 << 16)

// Masks for relevant XCR0 register bits.
@@ -170,6 +171,7 @@ struct CPUIDInfo
bool sse2 : 1;
bool avx : 1;
bool avx2 : 1;
bool avx512 : 1;
bool fma3 : 1;
bool fma4 : 1;
#if POV_CPUINFO_DEBUG
@@ -184,6 +186,7 @@ CPUIDInfo::CPUIDInfo() :
sse2(false),
avx(false),
avx2(false),
avx512(false),
fma3(false),
fma4(false),
vendorId(kCPUVendor_Unrecognized)
@@ -220,6 +223,11 @@ CPUIDInfo::CPUIDInfo() :
CPUID(info, 0x7);
avx2 = ((info[CPUID_EBX] & CPUID_00000007_EBX_AVX2_MASK) != 0);
}
if (maxLeaf >= 0x7)
{
CPUID(info, 0x7);
avx512 = ((info[CPUID_EBX] & CPUID_00000007_EBX_AVX512_MASK) != 0);
}
CPUID(info, 0x80000000);
int maxLeafExt = info[CPUID_EAX];
if (maxLeafExt >= (int)0x80000001)
@@ -233,6 +241,7 @@ struct OSInfo
{
bool xcr0_sse : 1;
bool xcr0_avx : 1;
bool xcr0_avx512 : 1;
OSInfo(const CPUIDInfo& cpuinfo);
};

@@ -278,6 +287,16 @@ bool CPUInfo::SupportsAVX()
&& gpData->osInfo.xcr0_avx;
}

bool CPUInfo::SupportsAVX512()
{
return gpData->cpuidInfo.osxsave
&& gpData->cpuidInfo.avx
&& gpData->cpuidInfo.avx2
&& gpData->cpuidInfo.avx512
&& gpData->osInfo.xcr0_sse
&& gpData->osInfo.xcr0_avx;
}

bool CPUInfo::SupportsAVX2()
{
return gpData->cpuidInfo.osxsave
@@ -329,6 +348,8 @@ std::string CPUInfo::GetFeatures()
features.push_back("AVX");
if (SupportsAVX2())
features.push_back("AVX2");
if (SupportsAVX512())
features.push_back("AVX512");
if (SupportsFMA3())
features.push_back("FMA3");
if (SupportsFMA4())
@@ -356,6 +377,8 @@ std::string CPUInfo::GetDetails()
cpuidFeatures.push_back("AVX");
if (gpData->cpuidInfo.avx2)
cpuidFeatures.push_back("AVX2");
if (gpData->cpuidInfo.avx512)
cpuidFeatures.push_back("AVX512");
if (gpData->cpuidInfo.fma3)
cpuidFeatures.push_back("FMA");
if (gpData->cpuidInfo.fma4)
@@ -371,6 +394,8 @@ std::string CPUInfo::GetDetails()

if (gpData->osInfo.xcr0_avx)
xcr0Features.push_back("AVX");
if (gpData->osInfo.xcr0_avx512)
xcr0Features.push_back("AVX512");
if (gpData->osInfo.xcr0_sse)
xcr0Features.push_back("SSE");
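
`SupportsAVX512()` above tests the SSE and AVX bits of XCR0. For the operating system to preserve full AVX-512 register state, XCR0 also needs the opmask, ZMM_Hi256 and Hi16_ZMM bits (bits 5 to 7). The sketch below shows one way the `xcr0_avx512` flag could be derived; the constant and function names are hypothetical and not taken from cpuid.cpp, and the code works on a plain integer rather than calling `getXCR0()`.

```cpp
#include <cassert>
#include <cstdint>

// XCR0 state-component bits (hypothetical names, standard bit positions).
constexpr std::uint64_t kXcr0Sse      = 1ull << 1; // XMM registers
constexpr std::uint64_t kXcr0Avx      = 1ull << 2; // upper halves of YMM0-YMM15
constexpr std::uint64_t kXcr0Opmask   = 1ull << 5; // k0-k7 mask registers
constexpr std::uint64_t kXcr0ZmmHi256 = 1ull << 6; // upper halves of ZMM0-ZMM15
constexpr std::uint64_t kXcr0Hi16Zmm  = 1ull << 7; // ZMM16-ZMM31

// All state components that must be enabled before AVX-512 code may run.
constexpr std::uint64_t kXcr0Avx512Required =
    kXcr0Sse | kXcr0Avx | kXcr0Opmask | kXcr0ZmmHi256 | kXcr0Hi16Zmm;

static_assert(kXcr0Avx512Required == 0xE6, "SSE, AVX and the three AVX-512 components");

// Returns true only if every required component is enabled in the given XCR0 value.
constexpr bool OsEnablesAvx512(std::uint64_t xcr0)
{
    return (xcr0 & kXcr0Avx512Required) == kXcr0Avx512Required;
}

// Simulated example values: 0xE7 has all components, 0x07 has only x87, SSE and AVX.
inline void CheckXcr0Examples()
{
    assert(OsEnablesAvx512(0xE7));
    assert(!OsEnablesAvx512(0x07));
}
```

Comparing against the full mask, rather than testing for any non-zero bit, avoids reporting support when only some of the components are enabled.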

1 change: 1 addition & 0 deletions platform/x86/cpuid.h
@@ -45,6 +45,7 @@ class CPUInfo
static bool SupportsSSE2(); ///< Test whether CPU and OS support SSE2.
static bool SupportsAVX(); ///< Test whether CPU and OS support AVX.
static bool SupportsAVX2(); ///< Test whether CPU and OS support AVX2.
static bool SupportsAVX512(); ///< Test whether CPU and OS support AVX512.
static bool SupportsFMA3(); ///< Test whether CPU and OS support FMA3.
static bool SupportsFMA4(); ///< Test whether CPU and OS support FMA4.
static bool IsIntel(); ///< Test whether CPU is genuine Intel product.
57 changes: 57 additions & 0 deletions platform/x86/optimizednoise.cpp
@@ -39,6 +39,10 @@

#include "core/material/noise.h"

#ifdef TRY_OPTIMIZED_NOISE_AVX512
#include "avx512/avx512noise.h"
#endif

#ifdef TRY_OPTIMIZED_NOISE_AVX2FMA3
#include "avx2fma3/avx2fma3noise.h"
#endif
@@ -65,19 +69,48 @@ namespace pov
static bool AVXSupported() { return CPUInfo::SupportsAVX(); }
static bool AVXFMA4Supported() { return CPUInfo::SupportsAVX() && CPUInfo::SupportsFMA4(); }
static bool AVX2FMA3Supported() { return CPUInfo::SupportsAVX2() && CPUInfo::SupportsFMA3(); }
static bool AVX512Supported() { return CPUInfo::SupportsAVX512(); }

/// List of optimized noise implementations.
///
/// @note
/// Entries must be listed in descending order of preference.
///
OptimizedNoiseInfo gaOptimizedNoiseInfo[] = {
#ifdef TRY_OPTIMIZED_NOISE_AVX512
{
"avx512-mcw", // name,
"hand-optimized by MCW", // info,
AVX512Noise, // noise,
AVX512DNoise, // dNoise,
AVX512Noise2D, // noise2D,
AVX512DNoise2D, // dNoise2D,
AVX512Noise8D, // noise8D,
DTurbulenceAVX512, // DTurbulence
Initialize_WavesAVX512, // Initialize Waves
TurbulenceAVX512, // Turbulence
wrinklesAVX512, // wrinkles
true, // value to set versions of WrinklesPattern and GranitePattern
&kAVX512NoiseEnabled, // enabled,
AVX512Supported, // supported,
nullptr, // recommended,
AVX512NoiseInit // init
},
#endif
#ifdef TRY_OPTIMIZED_NOISE_AVX2FMA3
{
"avx2fma3-intel", // name,
"hand-optimized by Intel", // info,
AVX2FMA3Noise, // noise,
AVX2FMA3DNoise, // dNoise,
nullptr, // noise2D
nullptr, // dnoise2D,
nullptr, // noise8D,
DTurbulenceAVX, // DTurbulence
Initialize_WavesAVX, // Initialize Waves
TurbulenceAVX, // Turbulence
wrinklesAVX, // wrinkles
false, // value to set versions of WrinklesPattern and GranitePattern
&kAVX2FMA3NoiseEnabled, // enabled,
AVX2FMA3Supported, // supported,
CPUInfo::IsIntel, // recommended,
@@ -90,6 +123,14 @@ OptimizedNoiseInfo gaOptimizedNoiseInfo[] = {
"hand-optimized by AMD, 2017-04 update", // info,
AVXFMA4Noise, // noise,
AVXFMA4DNoise, // dNoise,
nullptr, // noise2D
nullptr, // dnoise2D,
nullptr, // noise8D,
DTurbulenceAVX, // DTurbulence
Initialize_WavesAVX, // Initialize Waves
TurbulenceAVX, // Turbulence
wrinklesAVX, // wrinkles
false, // value to set versions of WrinklesPattern and GranitePattern
&kAVXFMA4NoiseEnabled, // enabled,
AVXFMA4Supported, // supported,
nullptr, // recommended,
@@ -102,6 +143,14 @@ OptimizedNoiseInfo gaOptimizedNoiseInfo[] = {
"hand-optimized by Intel", // info,
AVXNoise, // noise,
AVXDNoise, // dNoise,
nullptr, // noise2D
nullptr, // dnoise2D,
nullptr, // noise8D,
DTurbulenceAVX, // DTurbulence
Initialize_WavesAVX, // Initialize Waves
TurbulenceAVX, // Turbulence
wrinklesAVX, // wrinkles
false, // value to set versions of WrinklesPattern and GranitePattern
&kAVXNoiseEnabled, // enabled,
AVXSupported, // supported,
CPUInfo::IsIntel, // recommended,
@@ -114,6 +163,14 @@ OptimizedNoiseInfo gaOptimizedNoiseInfo[] = {
"compiler-optimized", // info,
AVXPortableNoise, // noise,
AVXPortableDNoise, // dNoise,
nullptr, // noise2D
nullptr, // dnoise2D,
nullptr, // noise8D,
DTurbulenceAVX, // DTurbulence
Initialize_WavesAVX, // Initialize Waves
TurbulenceAVX, // Turbulence
wrinklesAVX, // wrinkles
false, // value to set versions of WrinklesPattern and GranitePattern
&kAVXPortableNoiseEnabled, // enabled,
AVXSupported, // supported,
nullptr, // recommended,
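
The note on `gaOptimizedNoiseInfo` says entries must be listed in descending order of preference, which is why the AVX512 entry is placed first. A minimal sketch of how such a table might be consumed follows. `NoiseImpl` and `SelectFirstUsable` are hypothetical, trimmed-down stand-ins, and the selection rule (enabled, supported, and recommended when a `recommended` callback is present) is an assumption for illustration, not the actual POV-Ray dispatch code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, reduced version of an OptimizedNoiseInfo entry.
struct NoiseImpl
{
    const char* name;       // e.g. "avx512-mcw"
    const bool* enabled;    // build-time switch, like &kAVX512NoiseEnabled
    bool (*supported)();    // runtime CPU/OS check, like AVX512Supported
    bool (*recommended)();  // optional vendor check; nullptr means no preference
};

// Walk the table front to back and return the first usable entry,
// or nullptr if no entry qualifies (assumed selection rule, see lead-in).
inline const NoiseImpl* SelectFirstUsable(const NoiseImpl* table, std::size_t count)
{
    assert(table != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i)
    {
        const NoiseImpl& impl = table[i];
        if (impl.enabled != nullptr && !*impl.enabled)
            continue; // compiled out or disabled
        if (impl.supported != nullptr && !impl.supported())
            continue; // CPU or OS lacks the instruction set
        if (impl.recommended != nullptr && !impl.recommended())
            continue; // not advised on this vendor's hardware
        return &impl;
    }
    return nullptr;
}
```

With a front-to-back scan like this, placing a new entry at the top of the table is enough to make it the preferred choice on machines that support it, while older CPUs fall through to the later entries.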