
Conversation

@Technici4n (Collaborator)

The goal of this PR is to make the ultrasoft SCF work. Further PRs can implement US extensions for interatomic forces, DFPT, etc... This should also make implementing PAW easier...

Currently I get agreement to within 8e-4 Ry with QE 7.4.1 on 4x4x4 silicon, using the pseudopotential from pslibrary 1.0.0 (the SSSP Efficiency one). Maybe QE 7.5 with the integration changes will even improve this?

The flip side is that I had to hardcode the specific pseudopotential in a few places... Definitely needs a ton of cleanup. 😄

Some points to discuss:

  • I hacked in reading the PP_AUGMENTATION data. PseudoPotentialIO 0.1 cannot read PP_AUGMENTATION data. And PseudoPotentialIO 0.2 seems to do more than just IO - some of its dependencies are not compatible with DFTK's. @azadoks
  • How to store the overlap in the basis/hamiltonian?
  • Is compute_density the right place to add the augmentation charge?
  • How to deal with the Hamiltonian contribution of the augmentation charges?
  • We need two FFT grids to get a real speedup from US: the orbitals are a lot softer than the density, so they can use a smaller FFT grid. Matching two FFT grids with different maximum frequencies is easily done in Fourier space (see the sketch after this list).
  • We need some US pseudopotentials in PseudoPotentialData. Somewhat related: supersampling is awkward with US - specifying a density cutoff would be more natural.
  • Performance is not great... yet! 😄
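
To make the grid-matching bullet concrete, here is a toy 1D sketch of the Fourier-space transfer (none of these names are DFTK API; in 3D the same copy happens per dimension with the usual fftfreq-style index bookkeeping):

```julia
using FFTW

# Transfer periodic data between two FFT grids of different cutoffs by going through
# Fourier space: copy the coarse-grid coefficients into the matching low-frequency slots
# of the fine grid (zero-padding); the fine → coarse direction is the same copy in
# reverse (truncation). Odd grid sizes are assumed so every frequency has one slot.
function coarse_to_fine(x_coarse::Vector{Float64}, n_fine::Int)
    n_coarse = length(x_coarse)
    @assert isodd(n_coarse) && n_fine ≥ n_coarse
    half   = (n_coarse - 1) ÷ 2
    X      = fft(x_coarse)
    X_fine = zeros(ComplexF64, n_fine)
    X_fine[1:half+1]       .= X[1:half+1]        # frequencies 0, 1, ..., half
    X_fine[end-half+1:end] .= X[end-half+1:end]  # frequencies -half, ..., -1
    real(ifft(X_fine)) .* (n_fine / n_coarse)    # rescale so the point values match
end
```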

A few references for USPPs that were very helpful for me:

@mfherbst (Member)

The secret is revealed 🎉

@antoine-levitt (Member) left a comment

Awesome, thanks a lot for tackling this! Some minor comments skimming through the code.

I thought US needed an overlap matrix; how are you comparing accuracy wrt QE without one?

> We need two FFT grids to get a real speedup from US: the orbitals are a lot softer than the density, so they can use a smaller FFT grid. Matching two FFT grids with different maximum frequencies is easily done in Fourier space.

So this is another level of supersampling then? You're saying we need one basis set to represent orbitals, one FFT grid to do hamiltonian applies (potential-orbital), and one FFT grid to compute the density/potential? (right now the two FFT grids are the same)

# TODO: check this
function unpack_lm(lm::Integer)
@assert lm ≥ 1
l = floor(Int, sqrt(lm - 1))
Member

isqrt?

l = floor(Int, sqrt(lm - 1))
m = lm - (l^2 + l + 1)
return l, m
end
Member

define, in the same vein, an nharm(l) function or something like that (to keep all the details of the isomorphism from (l,m) to lm in the same place)
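
A minimal sketch of what that could look like, keeping the packing convention lm = l^2 + l + 1 + m from the snippet (nharm and pack_lm are placeholder names, not existing DFTK functions):

```julia
# Total number of real spherical harmonics with degree ≤ l, i.e. the size of the
# lm range 1:(l+1)^2 under this packing (assuming that is the intended meaning of nharm).
nharm(l::Integer) = (l + 1)^2

# Inverse of unpack_lm: map (l, m) with -l ≤ m ≤ l to the linear index lm ≥ 1.
pack_lm(l::Integer, m::Integer) = l^2 + l + 1 + m

function unpack_lm(lm::Integer)
    @assert lm ≥ 1
    l = isqrt(lm - 1)            # exact integer square root, as suggested above
    m = lm - (l^2 + l + 1)
    (l, m)
end
```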

# Sample points randomly on a sphere
rng = Xoshiro(42)
points = rand(rng, T, 3, nharm)
points ./= sqrt.(sum(points.^2, dims=1))
Member

we have a columnwise_norm function now I think

return l, m
end

function gaunt_coefficients(T::Type, l::Integer)
Member

This algorithm is weird (but I know nothing about spherical harmonics), is there no deterministic way to get those? Add a ref at least.

Collaborator Author

This is inspired by what QE does (it's probably the exact same thing?). The idea is that since spherical harmonics are a basis (of functions), randomly evaluating them should also give a basis (of finite dimensional vectors). But with finite dimensional vectors it's just a simple matrix inversion to recover the basis coefficients.

We could also compute the coefficients directly but it's more technical (two Clebsch-Gordan coefficients - can be computed with WignerSymbols.jl but still annoying :/), especially with real spherical harmonics for which the formulas are apparently slightly different.

For sure this needs a bit of thought wrt. condition numbers, maybe adding a few extra points, etc... And certainly worth some unit tests as well.
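
Roughly, the sampling idea described above looks like this (a sketch only, reusing ylm_real and unpack_lm from the snippet; gaunt_coefficients_sketch and the array layout are placeholders, not the actual implementation):

```julia
using LinearAlgebra, Random

# Products Y_j1 * Y_j2 of real spherical harmonics with degree ≤ l expand exactly in the
# harmonics of degree ≤ 2l, so evaluating everything at nharm = (2l+1)^2 generic points
# turns the expansion into a square linear solve.
function gaunt_coefficients_sketch(T::Type, l::Integer)
    nsub  = (l + 1)^2      # harmonics with degree ≤ l (the ones whose products we expand)
    nharm = (2l + 1)^2     # harmonics with degree ≤ 2l (the expansion basis)

    rng = Xoshiro(42)
    points = rand(rng, T, 3, nharm)
    points ./= sqrt.(sum(abs2, points; dims=1))   # push the random points onto the sphere

    # Y[i, j] = j-th real spherical harmonic evaluated at the i-th point
    Y    = [ylm_real(unpack_lm(j)..., points[:, i]) for i in 1:nharm, j in 1:nharm]
    fact = factorize(Y)

    # coeffs[:, j1, j2]: expansion of Y_j1 * Y_j2 in the harmonics of degree ≤ 2l
    coeffs = zeros(T, nharm, nsub, nsub)
    for j1 in 1:nsub, j2 in 1:nsub
        rhs = [ylm_real(unpack_lm(j1)..., points[:, i]) *
               ylm_real(unpack_lm(j2)..., points[:, i]) for i in 1:nharm]
        coeffs[:, j1, j2] = fact \ rhs
    end
    coeffs
end
```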

Y[i, j] = ylm_real(unpack_lm(j)..., points[:, i])
end
end
fact = factorize(Y)
Member

I'd possibly be concerned about the possibility of degeneracy here (i.e. potentially bad conditioning), but maybe it's very rare. Can you make sure the rng you use is not subject to changes in future versions of Julia? Then you can just run this function with various seeds for various l, and pick a seed such that the condition number is mild.

(Otherwise you can maybe get something more robust with a rectangular approach, i.e. fitting the gaunt coefficients by least squares, with M observations for N unknowns, M >= N; see the sketch below.)
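
A self-contained sketch of that rectangular variant, reusing ylm_real and unpack_lm from the snippet (the value l = 2 and the oversampling factor 2 are arbitrary example choices):

```julia
using LinearAlgebra, Random

# Rectangular least-squares variant: more observation points than unknowns, so an
# unlucky draw of points hurts the conditioning less.
T, l    = Float64, 2
nharm   = (2l + 1)^2           # number of unknown expansion coefficients per product
npoints = 2 * nharm            # oversampling factor 2 is an arbitrary choice
rng     = Xoshiro(42)
points  = rand(rng, T, 3, npoints)
points ./= sqrt.(sum(abs2, points; dims=1))
Y    = [ylm_real(unpack_lm(j)..., points[:, i]) for i in 1:npoints, j in 1:nharm]
fact = qr(Y)                   # then coefficients = fact \ rhs gives the least-squares fit
```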

end
ρ = sum(getfield.(storages, :ρ))

# Hacky augmentation charges
Member

We need some kind of extra_density mechanism to handle core correction and ultrasoft in the same framework

Member

(which could be part of the terms interface, so that densities.jl calls out to all the terms to ask them for their contribution to the density)

Collaborator Author

Non-linear core correction (NLCC) is exclusive to the exchange-correlation term, so it's currently implemented in the Xc term only (and was already done like this before this PR). (In some papers you really see it as a separate density that is added only in the XC functional like $E_{xc}[\rho + \rho_{nlcc}]$.)

You can really see US as separating the localized "hard" contributions to the density from the "soft" contributions of the orbitals. So the real physical charge density (as used for Hartree, local potentials, xc) needs both the soft contribution from the orbitals and the hard contribution from the augmentation charges. But we might want to keep the two parts of the density somewhat separate, in case that helps with preconditioning or other use cases.
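
For concreteness, the standard ultrasoft decomposition being referred to here is (Vanderbilt-style notation; $\beta^a_i$ are the projectors and $Q^a_{ij}$ the augmentation functions of atom $a$):

$$\rho(\mathbf r) = \sum_{n\mathbf k} f_{n\mathbf k} \Big[ |\psi_{n\mathbf k}(\mathbf r)|^2 + \sum_{a}\sum_{ij} Q^a_{ij}(\mathbf r)\, \langle \psi_{n\mathbf k} | \beta^a_i \rangle \langle \beta^a_j | \psi_{n\mathbf k} \rangle \Big],$$

with the matching overlap $S = 1 + \sum_a \sum_{ij} |\beta^a_i\rangle\, q^a_{ij}\, \langle \beta^a_j|$, where $q^a_{ij} = \int Q^a_{ij}(\mathbf r)\, \mathrm d\mathbf r$.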

@@ -0,0 +1,59 @@
function atom_local_grid(lattice::AbstractMatrix{T}, fft_grid::FFTGrid, position, radius) where {T}
# Find bounds on the points we have to check along each dimension
Member

That looks like computations we have to do for ewald and/or fft_size?

@Technici4n (Collaborator Author) left a comment

I won't have a lot of time to continue this work for the next month or so, so we'll probably want to discuss how the code should be organized at the September meeting. (I'm also happy to give a short lecture on US psps to explain what the code needs to do.)

psp_positions = [model.positions[group] for group in psp_groups]

# We have an overlap term
B = map(ham.blocks) do block
@Technici4n (Collaborator Author) commented on Jul 29, 2025

@antoine-levitt this is my hacky implementation of the overlap term. It has the same structure as the nonlocal term (nonlocal is PDP'; overlap is I + PQP'), just with a different matrix in the center: replace D by Q.

(Right now this Q matrix comes straight from the PP_Q in the pseudopotential file, but we will likely want to integrate the discretized augmentation charges to ensure that we get the right number of electrons in the integrated density.)

Btw it's nice that LOBPCG "just works" with the overlap B. :)
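
A minimal sketch of that "same structure as nonlocal" apply (P is the matrix of projectors in the plane-wave basis, Q the per-atom augmentation matrix; apply_overlap and the argument names are placeholders, not the actual DFTK interface):

```julia
# Generalized overlap for ultrasoft pseudopotentials: S ψ = ψ + P Q (P' ψ).
# P has one column per (atom, projector) pair; Q is the block-diagonal matrix of the
# augmentation integrals q_ij of each atom. Structurally identical to the nonlocal
# apply ψ + P D (P' ψ), just with Q in place of D.
function apply_overlap(P::AbstractMatrix, Q::AbstractMatrix, ψ::AbstractVecOrMat)
    ψ .+ P * (Q * (P' * ψ))
end
```

Passing such an S as the B matrix turns the eigenproblem into the generalized one, H ψ = λ S ψ, which LOBPCG handles directly.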

Member

Sorry, somehow I missed that.


@mfherbst mentioned this pull request on Jul 30, 2025
@Technici4n (Collaborator Author)

Hmmm, not sure I would trust the current implementation of augmentation charges. I might have implemented this less accurate flavor:
[image: screenshot of the equation for the less accurate augmentation-charge flavor referred to above]

@antoine-levitt (Member)

I see, this explains the question I had when you talked about it. I think we should be consistent and treat this like the projectors in norm-conserving pseudos.

@Technici4n (Collaborator Author)

I agree but that will require little Fourier boxes around each atom to keep the running time reasonable. Maybe not for this PR though 😄

@antoine-levitt (Member)

How do you mean? Can't we do the same as with norm-conserving, using hankel etc?

@Technici4n (Collaborator Author)

Ah, I see what you mean. Indeed I think we can do the (relatively expensive) Fourier interpolation at setup for each augmentation channel, and only store the real-space result in small boxes around the atoms (the augmentation charge should be zero elsewhere).

@antoine-levitt (Member)

I actually meant to just not optimize anything and store the full thing: compute the Fourier coefficients of the periodized augmentation charges using a Hankel transform, and ifft on the full grid to get the augmentation charges. Or are you concerned that storing basically the density Natom times is too memory-intensive?
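
A schematic sketch of that flow (form_factor, structure_factor, Gs and fft_size are stand-ins for the corresponding DFTK machinery, not actual names, and FFT normalization conventions are glossed over):

```julia
using FFTW

# "No optimization" version: for one augmentation channel Q_ij of one atom, evaluate its
# periodized Fourier coefficients on the full density grid and do a single iFFT.
#   Gs                  : Cartesian G-vectors of the density FFT grid (fft_size layout)
#   form_factor(G)      : radial Hankel transform of Q_ij times the angular part, at G
#   structure_factor(G) : e^{-i G · R} for the atom position R
function augmentation_density_sketch(Gs, form_factor, structure_factor, fft_size)
    Q_fourier = [form_factor(G) * structure_factor(G) for G in Gs]
    real(ifft(reshape(Q_fourier, fft_size)))
end
```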

@mfherbst (Member)

> density Natom times is too memory-intensive?

I think for larger-ish systems that can indeed become an issue. But I would argue that is something we could optimise in a second pass; the first implementation can be simple and wasteful.
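
For a rough sense of scale (illustrative numbers, not measurements): a 96³ density grid is about 0.9 million Float64 values, i.e. roughly 7 MB per atom, so storing the augmentation charge atom-by-atom for ~100 atoms already costs on the order of 0.7 GB.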

@azadoks (Contributor) commented on Sep 15, 2025

> The goal of this PR is to make the ultrasoft SCF work. Further PRs can implement US extensions for interatomic forces, DFPT, etc... This should also make implementing PAW easier...
>
> Some points to discuss:
>
> * I hacked in reading the PP_AUGMENTATION data. PseudoPotentialIO 0.1 cannot read PP_AUGMENTATION data. And PseudoPotentialIO 0.2 seems to do more than just IO - some of its dependencies are not compatible with DFTK's. @azadoks

Great work! Looking forward to seeing this merged 😄.

I spent a bit of time this weekend cutting out all the psp quantity evaluation code from PseudoPotentialIO; the result is in the io_only branch.
I'd like to deprecate the old Dict loaders and move towards returning structured data on load, but I'm happy to take feedback on which is more convenient on the DFTK side.

Once we know that the io_only branch implements everything you need, I can release it as v0.3.
