Tachyon is a distributed, in-memory storage system that is developed separately from Spark and can be used as off-heap persistent storage by a Spark application.
Tungsten is a new Spark SQL component that provides more efficient Spark operations by working directly at the byte level. Since Tungsten no longer depends on working with Java objects, we can use either on-heap (in the JVM) or off-heap storage.
In off-heap mode, both reduce garbage collection overhead, since data is not stored as Java objects.
So can I simply assume that Tachyon benefits general RDDs, whereas Spark SQL benefits from Tungsten?
Suppose the following code:
import org.apache.spark.storage.StorageLevel

val df = spark.range(10)
val rdd = df.rdd
df.persist(StorageLevel.OFF_HEAP) // stored in Tungsten format (bytes)?
df.show
rdd.persist(StorageLevel.OFF_HEAP) // stored in Tachyon?
rdd.count
In short, both of your statements are incorrect:
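- OFF_HEAP storage doesn't use Alluxio (formerly Tachyon) anymore and instead uses Spark's internal off-heap store. See for example SPARK-16025.
- Cached Datasets use Spark SQL's in-memory columnar storage, which is configured through the spark.sql.inMemoryColumnarStorage.* properties.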
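To make the two points above concrete, here is a minimal sketch of how the question's snippet might be set up so that OFF_HEAP persistence goes through Spark's own off-heap store. It assumes Spark 2.1 or later running in local mode as a standalone script or notebook cell; the application name, the 512 MB off-heap size, and the columnar batch size are illustrative values chosen for this example, not recommendations.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

// Assumption: local mode with illustrative sizes; tune these for a real cluster.
val spark = SparkSession.builder()
  .appName("off-heap-sketch")                        // hypothetical name
  .master("local[*]")
  // Spark's internal off-heap store has to be enabled and given a size (in bytes).
  .config("spark.memory.offHeap.enabled", "true")
  .config("spark.memory.offHeap.size", "536870912")  // 512 MB, illustrative
  // Settings for the columnar cache used by cached Datasets.
  .config("spark.sql.inMemoryColumnarStorage.compressed", "true")
  .config("spark.sql.inMemoryColumnarStorage.batchSize", "10000")
  .getOrCreate()

val df = spark.range(10)
val rdd = df.rdd

// Neither call involves Alluxio/Tachyon; both use Spark-managed off-heap memory.
df.persist(StorageLevel.OFF_HEAP)
df.count()   // an action is needed to materialize the cache

rdd.persist(StorageLevel.OFF_HEAP)
rdd.count()

// Inspect the storage levels that were actually assigned.
println(df.storageLevel)
println(rdd.getStorageLevel)
```

Once both actions have run, the Storage tab of the Spark UI should list the cached Dataset and the cached RDD separately, which is a convenient way to check where each one ended up.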