I'm attempting to implement a simple Physics-Informed Neural Network (PINN) for solving the time-independent Schrödinger equation given an arbitrary scalar potential. The details aren't, I think, terribly relevant. I'm using TensorFlow 2.x and encountering an issue where the second derivative calculation always returns `None`, irrespective of the potential function used. Here's a simplified version of my code which exhibits the issue:
```python
import tensorflow as tf

class QuantumPINNSolver:
    def __init__(self, potential_func):
        self.potential_func = potential_func
        self.model = self.build_model()

    def build_model(self):
        return tf.keras.Sequential([
            tf.keras.layers.Dense(50, activation='relu', input_shape=(1,), dtype=tf.float64),
            tf.keras.layers.Dense(50, activation='relu', dtype=tf.float64),
            tf.keras.layers.Dense(50, activation='relu', dtype=tf.float64),
            tf.keras.layers.Dense(1, activation=None, dtype=tf.float64)
        ])

    @tf.function
    def schrodinger_loss(self, x, E):
        with tf.GradientTape(persistent=True) as tape2:
            with tf.GradientTape(persistent=True) as tape1:
                tape1.watch(x)
                psi = self.model(x)
            dpsi_dx = tape1.gradient(psi, x)
        d2psi_dx2 = tape2.gradient(dpsi_dx, x)
        print("psi:", psi)
        print("dpsi_dx:", dpsi_dx)
        print("d2psi_dx2:", d2psi_dx2)  # This is always None
        V_x = self.potential_func(x)
        schrodinger_eq = -d2psi_dx2 + (V_x - E) * psi
        return tf.reduce_mean(tf.square(schrodinger_eq))

# Usage
def harmonic_oscillator_potential(x):
    return 0.5 * tf.square(x)

solver = QuantumPINNSolver(potential_func=harmonic_oscillator_potential)
x = tf.random.uniform((100, 1), minval=-5, maxval=5, dtype=tf.float64)
E = tf.constant(0.5, dtype=tf.float64)
loss = solver.schrodinger_loss(x, E)
```
The issue is that `d2psi_dx2` is always `None`, even though `psi` and `dpsi_dx` have non-zero values. I've tried making sure every tensor and layer uses the same `dtype`, but the second derivative remains `None`. What could be causing this, and how can I fix it to properly calculate the second derivative? I'm not terribly familiar with TensorFlow (and I'm still trying to wrap my head around `GradientTape`), and I expect that this is some simple issue I haven't uncovered by reading the documentation.
Figured it out!

TensorFlow's `GradientTape` only automatically watches trainable `tf.Variable`s. My input `x` is a plain tensor, so every tape that needs gradients with respect to it has to watch it explicitly. In the original code only the inner tape (`tape1`) called `watch(x)`; the outer tape (`tape2`) never did, so from its perspective `dpsi_dx` was not connected to `x`, and `tape2.gradient(dpsi_dx, x)` returned `None`.

To fix this and correctly compute the second derivative, you need to manage the gradient calculation explicitly across the nested `GradientTape` contexts:
1. Use nested `GradientTape` contexts: one inner `GradientTape` to compute `psi` and its first derivative `dpsi_dx`, and an outer `GradientTape` to differentiate `dpsi_dx` again.
2. Watch `x` on both tapes: since `x` is not a `tf.Variable`, call `watch(x)` on the outer tape as well as the inner one. This was the missing piece.
3. Take the first derivative inside the outer tape's context: `tape1.gradient(psi, x)` must run while the outer tape is still recording, so that the outer tape can differentiate through it. Declare the tapes as persistent (`persistent=True`) if you need to compute more than one gradient from the same tape.
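The mechanics are easier to verify on a function with a known second derivative before wiring them into the model. Here is a minimal sketch (not from the original post) using y = x³, so dy/dx = 3x² and d²y/dx² = 6x:

```python
import tensorflow as tf

x = tf.constant([[1.0], [2.0]], dtype=tf.float64)

with tf.GradientTape() as outer:
    outer.watch(x)  # x is a plain tensor, so the outer tape must watch it too
    with tf.GradientTape() as inner:
        inner.watch(x)
        y = x ** 3
    # Take the first derivative while the outer tape is still recording
    dy_dx = inner.gradient(y, x)    # 3x^2 -> [[3.], [12.]]
d2y_dx2 = outer.gradient(dy_dx, x)  # 6x   -> [[6.], [12.]]
```

If you drop the `outer.watch(x)` line, `d2y_dx2` comes back as `None`, which reproduces the original symptom exactly.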
```python
def schrodinger_loss(self, x, E):
    with tf.GradientTape(persistent=True) as tape:
        tape.watch(x)
        with tf.GradientTape(persistent=True) as tape1:
            tape1.watch(x)
            psi = self.model(x)
        # First derivative, taken while the outer tape is still recording
        dpsi_dx = tape1.gradient(psi, x)
    # Second derivative, via the outer tape that watched x
    d2psi_dx2 = tape.gradient(dpsi_dx, x)
    V_x = self.potential_func(x)
    schrodinger_eq = -d2psi_dx2 + (V_x - E) * psi
    return tf.reduce_mean(tf.square(schrodinger_eq))
```
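One caveat for the broader PINN setup (an observation about the posted architecture, not something the fix above requires): `relu` is piecewise linear, so once the `None` is fixed, `d2psi_dx2` will come back as all zeros almost everywhere, and the Schrödinger residual degenerates to `(V - E) * psi`. Smooth activations such as `tanh` are the usual choice for PINNs. A quick comparison, using a small helper (my own, hypothetical name) built on the same nested-tape pattern:

```python
import tensorflow as tf

def second_derivative(model, x):
    """d^2 model(x) / dx^2 via nested tapes, both watching x."""
    with tf.GradientTape() as outer:
        outer.watch(x)
        with tf.GradientTape() as inner:
            inner.watch(x)
            y = model(x)
        dy_dx = inner.gradient(y, x)
    return outer.gradient(dy_dx, x)

x = tf.random.uniform((100, 1), minval=-5, maxval=5, dtype=tf.float64)

relu_model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, activation='relu', input_shape=(1,), dtype=tf.float64),
    tf.keras.layers.Dense(1, dtype=tf.float64),
])
tanh_model = tf.keras.Sequential([
    tf.keras.layers.Dense(50, activation='tanh', input_shape=(1,), dtype=tf.float64),
    tf.keras.layers.Dense(1, dtype=tf.float64),
])

d2_relu = second_derivative(relu_model, x)  # connected to x, but identically zero
d2_tanh = second_derivative(tanh_model, x)  # generically non-zero
```

Both calls return a tensor rather than `None`, but only the `tanh` network has a useful (non-vanishing) second derivative to feed into the loss.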