javacassandracassandra-3.0

Cassandra suddenly hangs, returns Windows FileSystemException: "The process is inaccessible because the file is in use by another process"


I am running a product.

I'm using these versions:

The server specification is 128GB of total memory. I'm using 24GB. and WindowServer 2019 Standard.
In addition, "nodetool flush" runs every four hours.

If you look at the Cassandra log below, you can see that the work is continuing until the time is 14:47:05.865.

[ 00:47:06.117 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 00:47:06,116 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 01:47:06.098 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 01:47:06,097 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 02:46:58.241 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [CompactionExecutor:9301] 2023-09-04 02:46:58,240 AutoSavingCache.java:395 - Saved KeyCache (527 items) in 7 ms ] 
[ 02:47:06.080 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 02:47:06,080 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 03:47:06.062 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 03:47:06,061 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 04:47:06.044 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 04:47:06,043 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 05:47:06.026 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 05:47:06,025 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 06:46:58.152 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [CompactionExecutor:9391] 2023-09-04 06:46:58,151 AutoSavingCache.java:395 - Saved KeyCache (527 items) in 7 ms ] 
[ 06:47:06.008 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 06:47:06,008 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 07:47:05.990 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 07:47:05,989 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 08:47:05.972 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 08:47:05,971 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 09:47:05.953 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 09:47:05,952 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 10:46:58.065 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [CompactionExecutor:9480] 2023-09-04 10:46:58,064 AutoSavingCache.java:395 - Saved KeyCache (527 items) in 7 ms ] 
[ 10:47:05.936 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 10:47:05,936 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 11:47:05.918 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 11:47:05,917 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 12:47:05.901 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 12:47:05,900 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 13:47:05.884 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 13:47:05,883 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 14:46:57.977 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [CompactionExecutor:9568] 2023-09-04 14:46:57,976 AutoSavingCache.java:395 - Saved KeyCache (527 items) in 7 ms ] 
[ 14:47:05.865 ] [ [34mINFO [0;39m ] [ [36m.g.n.e.c.a.Cassandra[0;39m ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 14:47:05,865 IndexSummaryRedistribution.java:78 - Redistributing index summaries ] 
[ 15:21:18.913 ] [ [34mINFO [0;39m ] [ [36m.c.EmbeddedCassandra[0;39m ] [ stop                 ] [ 85  ] [ Stops EmbeddedCassandra[name='cassandra-1', version='3.11.7'] ] 
[ 15:21:21.915 ] [ [31mWARN [0;39m ] [ [36mWindowsCassandraNode[0;39m ] [ stop                 ] [ 111 ] [ java.lang.Process.destroyForcibly() has been called for 'WindowsCassandraNode[pid='-1', exitValue='not exited']'. The behavior of this method is undefined, hence Cassandra's node could be still alive ] 
[ 15:21:21.916 ] [ [34mINFO [0;39m ] [ [36mdedCassandraDatabase[0;39m ] [ stop                 ] [ 131 ] [ EmbeddedCassandraDatabase[name='cassandra-1', version='3.11.7', node=WindowsCassandraNode[pid='-1', exitValue='1']] has been stopped ] 
[ 15:21:21.918 ] [ [1;31mERROR[0;39m ] [ [36mdedCassandraDatabase[0;39m ] [ stop                 ] [ 137 ] [ Working Directory 'C:\Windows\TEMP\apache-cassandra-3.11.7-3621485632518011638' has not been deleted ] 
java.nio.file.FileSystemException: C:\Windows\TEMP\apache-cassandra-3.11.7-3621485632518011638\.toDelete: The process is inaccessible because the file is in use by another process.

    at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
    at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
    at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
    at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269)
    at sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(AbstractFileSystemProvider.java:108)
    at java.nio.file.Files.deleteIfExists(Files.java:1165)
    at com.github.nosan.embedded.cassandra.commons.util.FileUtils$1.visitFile(FileUtils.java:82)
    at com.github.nosan.embedded.cassandra.commons.util.FileUtils$1.visitFile(FileUtils.java:78)
    at java.nio.file.Files.walkFileTree(Files.java:2670)
    at java.nio.file.Files.walkFileTree(Files.java:2742)
    at com.github.nosan.embedded.cassandra.commons.util.FileUtils.delete(FileUtils.java:78)
    at com.github.nosan.embedded.cassandra.EmbeddedCassandraDatabase.stop(EmbeddedCassandraDatabase.java:134)
    at com.github.nosan.embedded.cassandra.EmbeddedCassandra.doStop(EmbeddedCassandra.java:157)
    at com.github.nosan.embedded.cassandra.EmbeddedCassandra.stop(EmbeddedCassandra.java:86)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.springframework.beans.factory.support.DisposableBeanAdapter.invokeCustomDestroyMethod(DisposableBeanAdapter.java:339)
    at org.springframework.beans.factory.support.DisposableBeanAdapter.destroy(DisposableBeanAdapter.java:273)
    at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:571)
    at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:543)
    at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingleton(DefaultListableBeanFactory.java:1072)
    at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingletons(DefaultSingletonBeanRegistry.java:504)
    at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingletons(DefaultListableBeanFactory.java:1065)
    at org.springframework.context.support.AbstractApplicationContext.destroyBeans(AbstractApplicationContext.java:1060)
    at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:1029)
    at org.springframework.context.support.AbstractApplicationContext.close(AbstractApplicationContext.java:978)
    at org.springframework.boot.SpringApplication.close(SpringApplication.java:1284)
    at org.springframework.boot.SpringApplication.exit(SpringApplication.java:1271)
    at kr.co.wisenut.manager.sf1.SearchManagerApplication.main(SearchManagerApplication.java:72)
[ 15:21:21.920 ] [ [34mINFO [0;39m ] [ [36m.c.EmbeddedCassandra[0;39m ] [ stop                 ] [ 87  ] [ EmbeddedCassandra[name='cassandra-1', version='3.11.7'] has been stopped ]

However, if I look at the application log I'm running, I can see that I can't access Cassandra from around 08:44:48.596.

[ 08:44:48.596 ] [ [1;31mERROR[0;39m ] [ [36mc.e.ExceptionHandler[0;39m ] [ renderErrorResponse  ] [ 73  ] [ Query; CQL [SELECT * FROM manager_common_user;]; All host(s) tried for query failed (no host was tried); nested exception is com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried) ] 
[ 08:45:37.999 ] [ [1;31mERROR[0;39m ] [ [36mc.e.ExceptionHandler[0;39m ] [ renderErrorResponse  ] [ 73  ] [ Query; CQL [SELECT * FROM manager_common_user;]; All host(s) tried for query failed (no host was tried); nested exception is com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried) ] 
[ 15:18:56.148 ] [ [1;31mERROR[0;39m ] [ [36mc.e.ExceptionHandler[0;39m ] [ renderErrorResponse  ] [ 73  ] [ Query; CQL [SELECT * FROM manager_common_user;]; All host(s) tried for query failed (no host was tried); nested exception is com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried) ] 
[ 15:21:14.880 ] [ [34mINFO [0;39m ] [ [36m.CustomTaskScheduler[0;39m ] [ shutdown             ] [ 208 ] [ Shutting down ExecutorService 'customTaskScheduler' ]

and, disabled or hang issue is log below related?

[ 15:21:21.918 ] [ [1;31mERROR[0;39m ] [ [36mdedCassandraDatabase[0;39m ] [ stop                 ] [ 137 ] [ Working Directory 'C:\Windows\TEMP\apache-cassandra-3.11.7-3621485632518011638' has not been deleted ] 
java.nio.file.FileSystemException: C:\Windows\TEMP\apache-cassandra-3.11.7-3621485632518011638\.toDelete: The process is inaccessible because the file is in use by another process.

In the middle, there was no error log left in the Cassandra log.
Please help me with what part I'm missing.


Solution

  • Based on the log entries you posted, it appears that there is an issue at the filesystem level where one of the temporary files is locked by a process so it can't be accessed by another.

    The Java file I/O library that Cassandra uses has always been problematic on Windows particularly with NTFS. It has been a constant source of issues for users that we decided it wasn't good for the project to continue support (see the discussion in this thread).

    We eventually dropped Windows support completely in Cassandra 4.0 (CASSANDRA-16171).

    The recommended workarounds for Windows users are: