reactjs graphql next.js wp-graphql

WP GraphQL query only returns first 100 posts when generating sitemap


I am building a dynamic sitemap and want it to include every blog post. When I run the WPGraphQL query in the GraphiQL IDE inside WordPress, it returns all the posts, but when my code executes the same query, only the first 100 come back. I may be overlooking something, but I can't see why this happens.

GraphQL Query:

export const GET_POSTS = gql`
  query GET_POSTS {
    posts(first: 10000) {
      nodes {
        title
        uri
        modified
      }
    }
  }
`;

Sitemap page (sitemap.xml.tsx):

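import fs from "fs";
// client (the Apollo client), GET_PAGES_SITEMAP and GET_POSTS are imported
// from the project's own modules; their paths are not shown here.

// The page component renders nothing; getServerSideProps writes the XML response.
const Sitemap = () => {};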

export const getServerSideProps = async ({ res }) => {
  const baseUrl = {
    development: "http://localhost:3000",
    production: "https://PRODURL.com",
  }[process.env.NODE_ENV];

  const staticPages = fs
    .readdirSync(
      {
        development: "pages",
        production: "./",
      }[process.env.NODE_ENV],
    )
    .filter((staticPage) => {
      return ![
        "_app.tsx",
        "[[...slug]].tsx",
        "_error.tsx",
        "sitemap.xml.tsx",
      ].includes(staticPage);
    })
    .map((staticPagePath) => {
      return `${baseUrl}/${staticPagePath}`;
    });

  const { data } = await client.query({
    query: GET_PAGES_SITEMAP,
  });

  const blogPosts = await client.query({
    query: GET_POSTS,
  });

  // Fall back to empty arrays if either query returns no data.
  const pages = data?.pages?.nodes || [];
  const posts = blogPosts?.data?.posts?.nodes || [];

  const sitemap = `<?xml version="1.0" encoding="UTF-8"?>
  <urlset 
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
    xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" 
    xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" 
    xmlns:video="http://www.google.com/schemas/sitemap-video/1.1" 
    xmlns:news="http://www.google.com/schemas/sitemap-news/0.9" 
    xmlns:mobile="http://www.google.com/schemas/sitemap-mobile/1.0" 
    xmlns:pagemap="http://www.google.com/schemas/sitemap-pagemap/1.0" 
    xmlns:xhtml="http://www.w3.org/1999/xhtml" 
    xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
      ${staticPages
        .map((url) => {
          const date = new Date();
          const dateFormatter = Intl.DateTimeFormat("sv-SE");
          return `
            <url>
              <loc>${url}</loc>
              <lastmod>${dateFormatter.format(date)}</lastmod>
              <changefreq>monthly</changefreq>
              <priority>1.0</priority>
            </url>
          `;
        })
        .join("")}
      ${pages
        .map(({ uri, modified }) => {
          const date = new Date(modified);
          const dateFormatter = Intl.DateTimeFormat("sv-SE");

          return `
              <url>
                <loc>${baseUrl}${uri}</loc>
                <lastmod>${dateFormatter.format(date)}</lastmod>
                <changefreq>weekly</changefreq>
                <priority>1.0</priority>
              </url>
            `;
        })
        .join("")}
        ${posts
          .map(({ uri, modified }) => {
            const date = new Date(modified);
            const dateFormatter = Intl.DateTimeFormat("sv-SE");

            return `
              <url>
                <loc>${baseUrl}/blog${uri}</loc>
                <lastmod>${dateFormatter.format(date)}</lastmod>
                <changefreq>weekly</changefreq>
                <priority>1.0</priority>
              </url>
            `;
          })
          .join("")}
    </urlset>
  `;

  res.setHeader("Content-Type", "text/xml");
  res.write(sitemap);
  res.end();

  return {
    props: {},
  };
};

export default Sitemap;

Solution

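  • By default, WPGraphQL caps the number of posts a connection returns per request at 100, so posts(first: 10000) is still limited to 100 nodes. You can raise this cap with the graphql_connection_max_query_amount filter. It is a WordPress (PHP) filter, so the override is registered with add_filter on the WordPress side, for example in the theme's functions.php or a small plugin.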

    From the graphql_connection_max_query_amount filter documentation:

    Filter the maximum number of posts per page that should be queried. The default is 100 to prevent queries from being exceedingly resource intensive, however individual systems can override this for their specific needs. This filter is intentionally applied AFTER the query_args filter.
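
    If raising the limit on the WordPress side is not an option, an alternative is to keep the default cap and page through the connection with cursors, since WPGraphQL connections expose pageInfo { hasNextPage endCursor } and accept an after argument. The following is only a minimal sketch of that approach, not something from the original answer: GET_POSTS_PAGE, fetchAllNodes, and the fetchPage callback are hypothetical names introduced here for illustration.

```javascript
// Hypothetical query with cursor variables. It is a plain string here;
// with Apollo it would be wrapped in gql`...` like GET_POSTS above.
const GET_POSTS_PAGE = `
  query GET_POSTS_PAGE($first: Int!, $after: String) {
    posts(first: $first, after: $after) {
      pageInfo {
        hasNextPage
        endCursor
      }
      nodes {
        title
        uri
        modified
      }
    }
  }
`;

// Collects every node of a connection by requesting one page at a time.
// fetchPage is a caller-supplied async function (an assumption of this sketch)
// that receives { first, after } and resolves to the connection object,
// i.e. something shaped like { nodes, pageInfo: { hasNextPage, endCursor } }.
async function fetchAllNodes(fetchPage, pageSize = 100) {
  const all = [];
  let after = null;
  let hasNextPage = true;

  while (hasNextPage) {
    const connection = await fetchPage({ first: pageSize, after });
    all.push(...(connection?.nodes ?? []));

    hasNextPage = Boolean(connection?.pageInfo?.hasNextPage);
    after = connection?.pageInfo?.endCursor ?? null;

    // Stop if the server reports more pages but provides no cursor,
    // otherwise the loop would request the first page forever.
    if (hasNextPage && after === null) break;
  }

  return all;
}

// Possible wiring inside getServerSideProps (assumes the Apollo client and
// a gql-wrapped version of GET_POSTS_PAGE from the question's setup):
//
//   const posts = await fetchAllNodes(async (variables) => {
//     const { data } = await client.query({ query: GET_POSTS_PAGE, variables });
//     return data.posts;
//   });
```

    With this approach each request stays within the 100-node default, at the cost of one round trip per 100 posts.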