I'm attempting to implement a simple Physics-Informed Neural Network (PINN) for solving the time-independent Schrödinger equation given an arbitrary scalar potential. The details aren't, I think, terribly relevant. I'm using TensorFlow 2.x and encountering an issue where the second derivative calculation always returns `None`, irrespective of the potential function used. Here's a simplified version of my code which exhibits the issue:
```python
import tensorflow as tf

class QuantumPINNSolver:
    def __init__(self, potential_func):
        self.potential_func = potential_func
        self.model = self.build_model()

    def build_model(self):
        return tf.keras.Sequential([
            tf.keras.layers.Dense(50, activation='relu', input_shape=(1,), dtype=tf.float64),
            tf.keras.layers.Dense(50, activation='relu', dtype=tf.float64),
            tf.keras.layers.Dense(50, activation='relu', dtype=tf.float64),
            tf.keras.layers.Dense(1, activation=None, dtype=tf.float64)
        ])

    @tf.function
    def schrodinger_loss(self, x, E):
        with tf.GradientTape(persistent=True) as tape2:
            with tf.GradientTape(persistent=True) as tape1:
                tape1.watch(x)
                psi = self.model(x)
            dpsi_dx = tape1.gradient(psi, x)
        d2psi_dx2 = tape2.gradient(dpsi_dx, x)
        print("psi:", psi)
        print("dpsi_dx:", dpsi_dx)
        print("d2psi_dx2:", d2psi_dx2)  # This is always None
        V_x = self.potential_func(x)
        schrodinger_eq = -d2psi_dx2 + (V_x - E) * psi
        return tf.reduce_mean(tf.square(schrodinger_eq))

# Usage
def harmonic_oscillator_potential(x):
    return 0.5 * tf.square(x)

solver = QuantumPINNSolver(potential_func=harmonic_oscillator_potential)
x = tf.random.uniform((100, 1), minval=-5, maxval=5, dtype=tf.float64)
E = tf.constant(0.5, dtype=tf.float64)
loss = solver.schrodinger_loss(x, E)
```
The issue is that `d2psi_dx2` is always `None`, even though `psi` and `dpsi_dx` have non-zero values. I've tried making sure every tensor and layer uses the same `dtype`, but the second derivative remains `None`. What could be causing this, and how can I fix it to properly calculate the second derivative? I'm not terribly familiar with TensorFlow (and I'm still trying to wrap my head around `GradientTape`), and I expect that this is some simple issue I haven't uncovered by reading the documentation.
Figured it out!

TensorFlow's `GradientTape` only automatically watches trainable `tf.Variable`s. My input `x` is a plain tensor, so every tape that needs gradients with respect to it has to watch it explicitly. In the original code only the inner tape (`tape1`) called `watch(x)`; the outer tape (`tape2`) never did, so from its perspective `dpsi_dx` was not connected to `x`, and `tape2.gradient(dpsi_dx, x)` returned `None`.

To fix this and correctly compute the second derivative, you need to manage the gradient calculation explicitly across the nested `GradientTape` contexts:
1. Use nested `GradientTape` contexts: one inner `GradientTape` to compute `psi` and its first derivative `dpsi_dx`, and an outer `GradientTape` to differentiate `dpsi_dx` again.
2. Watch `x` on both tapes: since `x` is not a `tf.Variable`, call `watch(x)` on the outer tape as well as the inner one. This was the missing piece.
3. Take the first derivative inside the outer tape's context: `tape1.gradient(psi, x)` must run while the outer tape is still recording, so that the outer tape can differentiate through it. Declare the tapes as persistent (`persistent=True`) if you need to compute more than one gradient from the same tape.
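The mechanics are easier to verify on a function with a known second derivative before wiring them into the model. Here is a minimal sketch (not from the original post) using y = x³, so dy/dx = 3x² and d²y/dx² = 6x:

```python
import tensorflow as tf

x = tf.constant([[1.0], [2.0]], dtype=tf.float64)

with tf.GradientTape() as outer:
    outer.watch(x)  # x is a plain tensor, so the outer tape must watch it too
    with tf.GradientTape() as inner:
        inner.watch(x)
        y = x ** 3
    # Take the first derivative while the outer tape is still recording
    dy_dx = inner.gradient(y, x)    # 3x^2 -> [[3.], [12.]]
d2y_dx2 = outer.gradient(dy_dx, x)  # 6x   -> [[6.], [12.]]
```

If you drop the `outer.watch(x)` line, `d2y_dx2` comes back as `None`, which reproduces the original symptom exactly.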
```python
def schrodinger_loss(self, x, E):
    with tf.GradientTape(persistent=True) as tape:
        tape.watch(x)
        with tf.GradientTape(persistent=True) as tape1:
            tape1.watch(x)
            psi = self.model(x)
        # First derivative, taken while the outer tape is still recording
        dpsi_dx = tape1.gradient(psi, x)
    # Second derivative, via the outer tape that watched x
    d2psi_dx2 = tape.gradient(dpsi_dx, x)
    V_x = self.potential_func(x)
    schrodinger_eq = -d2psi_dx2 + (V_x - E) * psi
    return tf.reduce_mean(tf.square(schrodinger_eq))
```
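One caveat for the broader PINN setup (an observation about the posted architecture, not something the fix above requires): `relu` is piecewise linear, so once the `None` is fixed, `d2psi_dx2` will come back as all zeros almost everywhere, and the Schrödinger residual degenerates to `(V - E) * psi`. Smooth activations such as `tanh` are the usual choice for PINNs. A quick comparison, using a small helper (my own, hypothetical name) built on the same nested-tape pattern:

```python
import tensorflow as tf

def second_derivative(model, x):
    """d^2 model(x) / dx^2 via nested tapes, both watching x."""
    with tf.GradientTape() as outer:
        outer.watch(x)
        with tf.GradientTape() as inner:
            inner.watch(x)
            y = model(x)
        dy_dx = inner.gradient(y, x)
    return outer.gradient(dy_dx, x)

x = tf.random.uniform((100, 1), minval=-5, maxval=5, dtype=tf.float64)

relu_model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, activation='relu', input_shape=(1,), dtype=tf.float64),
    tf.keras.layers.Dense(1, dtype=tf.float64),
])
tanh_model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, activation='tanh', input_shape=(1,), dtype=tf.float64),
    tf.keras.layers.Dense(1, dtype=tf.float64),
])

d2_relu = second_derivative(relu_model, x)  # connected to x, but identically zero
d2_tanh = second_derivative(tanh_model, x)  # generically non-zero
```

Both calls return a tensor rather than `None`, but only the `tanh` network has a useful (non-vanishing) second derivative to feed into the loss.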