i have a server instance here with 4 Cores and 32 GB RAM and Ubuntu 20.04.3 LTS installed. On this machine there is an opengrok-instance running as docker container.
Inside of the docker container it uses AdoptOpenJDK:
OpenJDK Runtime Environment AdoptOpenJDK-11.0.11+9 (build 11.0.11+9)
Eclipse OpenJ9 VM AdoptOpenJDK-11.0.11+9 (build openj9-0.26.0, JRE 11 Linux amd64-64-Bit Compressed References 20210421_975 (JIT enabled, AOT enabled)
OpenJ9 - b4cc246d9
OMR - 162e6f729
JCL - 7796c80419 based on jdk-11.0.11+9)
The code-base that the opengrok-indexer scans is 320 GB big and tooks 21 hours.
What i am figured is out was, that i've am disable the history-option it tooks lesser time. Is there a possibility to reduce this time, if the history-flag is set.
Here are my index-command:
opengrok-indexer -J=-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -J=-Djava.util.logging.config.file=/usr/share/tomcat10/conf/logging.properties -J=-XX:-UseGCOverheadLimit -J=-Xmx30G -J=-Xms30G -J=-server -a /var/opengrok/dist/lib/opengrok.jar -- -R /var/opengrok/etc/read-only.xml -m 256 -c /usr/bin/ctags -s /var/opengrok/src/ -d /var/opengrok/data --remote on -H -P -S -G -W /var/opengrok/etc/configuration.xml --progress -v -O on -T 3 --assignTags --search --remote on -i *.so -i *.o -i *.a -i *.class -i *.jar -i *.apk -i *.tar -i *.bz2 -i *.gz -i *.obj -i *.zip"
Thank you for your help in advance.
Kind Regards
Siegfried
You should try to increase the number of threads using the following options:
--historyThreads number
The number of threads to use for history cache generation on repository level. By default the number of threads will be set to the number of available CPUs.
Assumes -H/--history.
--historyFileThreads number
The number of threads to use for history cache generation when dealing with individual files.
By default the number of threads will be set to the number of available CPUs.
Assumes -H/--history.
-T, --threads number
The number of threads to use for index generation, repository scan
and repository invalidation.
By default the number of threads will be set to the number of available
CPUs. This influences the number of spawned ctags processes as well.
Take a look at the "renamedHistory" option too. Theoretically "off" is the default option but this has a huge impact on the index time, so it's worth the check:
--renamedHistory on|off
Enable or disable generating history for renamed files.
If set to on, makes history indexing slower for repositories
with lots of renamed files. Default is off.