dnscentosload-balancinground-robin

Could we use DNS round robin with nscd's dns cache?


I try to use DNS round robin with nscd's dns cache.

But I am not convinced about the belows.

  1. nscd respect the dns record ttl at its dns reply

  2. the traffic from clients with nscd are distributed equally to servers behind domain name

Is it possible to use DNS round robin with nscd?


Solution

  • Summary

    In detail

    Environment

    enter image description here

    #
    # /etc/nscd.conf
    #
    # An example Name Service Cache config file. This file is needed by nscd.
    #
    # Legal entries are:
    #
    # logfile <file>
    # debug-level <level>
    # threads <initial #threads to use>
    # max-threads <maximum #threads to use>
    # server-user <user to run server as instead of root>
    # server-user is ignored if nscd is started with -S parameters
    # stat-user <user who is allowed to request statistics>
    # reload-count unlimited|<number>
    # paranoia <yes|no>
    # restart-interval <time in seconds>
    #
    # enable-cache <service> <yes|no>
    # positive-time-to-live <service> <time in seconds>
    # negative-time-to-live <service> <time in seconds>
    # suggested-size <service> <prime number>
    # check-files <service> <yes|no>
    # persistent <service> <yes|no>
    # shared <service> <yes|no>
    # max-db-size <service> <number bytes>
    # auto-propagate <service> <yes|no>
    #
    # Currently supported cache names (services): passwd, group, hosts
    #
    
    
    # logfile /var/log/nscd.log
    # threads 6
    # max-threads 128
    server-user nscd
    # stat-user nocpulse
    debug-level 0
    # reload-count 5
    paranoia no
    # restart-interval 3600
    
    enable-cache passwd yes
    positive-time-to-live passwd 600
    negative-time-to-live passwd 20
    suggested-size passwd 211
    check-files passwd yes
    persistent passwd yes
    shared passwd yes
    max-db-size passwd 33554432
    auto-propagate passwd yes
    
    enable-cache group yes
    positive-time-to-live group 3600
    negative-time-to-live group 60
    suggested-size group 211
    check-files group yes
    persistent group yes
    shared group yes
    max-db-size group 33554432
    auto-propagate group yes
    
    enable-cache hosts yes
    positive-time-to-live hosts 300
    negative-time-to-live hosts 20
    suggested-size hosts 211
    check-files hosts yes
    persistent hosts yes
    shared hosts yes
    max-db-size hosts 33554432
    

    What experiments I did

    1. sending traffic to test-nscd.apps.com with TTL 1 ~ 60s from 1 locust workers. And checking traffic distributed at PM1, PM2
    2. sending traffic to test-nscd.apps.com with TTL 1 from 1 ~ 60 locust workers. And checking traffic distributed at PM1, PM2
    3. sending traffic to test-nscd.apps.com with TTL 1 from 1 ~ 60 locust workers using keepalive. And checking traffic distributed at PM1, PM2

    The test results

    1. sending traffic to test-nscd.apps.com with TTL 1 ~ 60s from 1 locust workers and checking traffic distributed at PM1, PM2

    enter image description here

    14:37:55.116675 IP (tos 0x80, ttl 49, id 41538, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain > test-client.39956: [udp sum ok] 9453 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [1m] A 10.130.248.64, test-nscd.apps.com. [1m] A 10.130.248.63 (83)
    --
    14:39:10.121451 IP (tos 0x80, ttl 49, id 20047, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain > test-client.55173: [udp sum ok] 6722 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [1m] A 10.130.248.63, test-nscd.apps.com. [1m] A 10.130.248.64 (83)
    --
    14:40:25.120127 IP (tos 0x80, ttl 49, id 28851, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain > test-client.39461: [udp sum ok] 40481 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [1m] A 10.130.248.63, test-nscd.apps.com. [1m] A 10.130.248.64 (83)
    --
    

    enter image description here

    16:14:04.359901 IP (tos 0x80, ttl 49, id 39510, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain >test-client.51466: [udp sum ok] 43607 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [5s] A 10.130.248.63, test-nscd.apps.com. [5s] A 10.130.248.64 (83)
    --
    16:14:19.361964 IP (tos 0x80, ttl 49, id 3196, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain >test-client.39370: [udp sum ok] 62519 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [5s] A 10.130.248.63, test-nscd.apps.com. [5s] A 10.130.248.64 (83)
    --
    16:14:34.364359 IP (tos 0x80, ttl 49, id 27647, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain >test-client.49659: [udp sum ok] 51890 q: A? test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [5s] A 10.130.248.64, test-nscd.apps.com. [5s] A 10.130.248.63 (83)
    --
    

    enter image description here

    15:45:04.141762 IP (tos 0x80, ttl 49, id 30678, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain >test-client.35411: [udp sum ok] 63073 q: A?test-nscd.apps.com. 2/0/0test-nscd.apps.com. [15s] A 10.130.248.63,test-nscd.apps.com. [15s] A 10.130.248.64 (83)
    --
    15:45:34.191159 IP (tos 0x80, ttl 49, id 48496, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain >test-client.52441: [udp sum ok] 24183 q: A?test-nscd.apps.com. 2/0/0test-nscd.apps.com. [15s] A 10.130.248.63,test-nscd.apps.com. [15s] A 10.130.248.64 (83)
    --
    15:46:04.192905 IP (tos 0x80, ttl 49, id 32793, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain >test-client.49875: [udp sum ok] 59065 q: A?test-nscd.apps.com. 2/0/0test-nscd.apps.com. [15s] A 10.130.248.63,test-nscd.apps.com. [15s] A 10.130.248.64 (83)
    --
    

    enter image description here

    16:14:04.359901 IP (tos 0x80, ttl 49, id 39510, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain > test-client.51466: [udp sum ok] 43607 q: A?test-nscd.apps.com. 2/0/0test-nscd.apps.com. [5s] A 10.130.248.63,test-nscd.apps.com. [5s] A 10.130.248.64 (83)
    --
    16:14:19.361964 IP (tos 0x80, ttl 49, id 3196, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain > test-client.com.39370: [udp sum ok] 62519 q: A?test-nscd.apps.com. 2/0/0test-nscd.apps.com. [5s] A 10.130.248.63,test-nscd.apps.com. [5s] A 10.130.248.64 (83)
    --
    16:14:34.364359 IP (tos 0x80, ttl 49, id 27647, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain > test-client.com.49659: [udp sum ok] 51890 q: A?test-nscd.apps.com. 2/0/0test-nscd.apps.com. [5s] A 10.130.248.64,test-nscd.apps.com. [5s] A 10.130.248.63 (83)
    --
    

    enter image description here

    16:43:27.814701 IP (tos 0x80, ttl 49, id 28956, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain > test-client.49891: [udp sum ok] 22634 q: A?test-nscd.apps.com. 2/0/0 test-nscd.apps.com. [1s] A 10.130.248.63,test-nscd.apps.com. [1s] A 10.130.248.64 (83)
    --
    16:43:42.816721 IP (tos 0x80, ttl 49, id 27128, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain > test-client.34490: [udp sum ok] 37589 q: A?test-nscd.apps.com. 2/0/0test-nscd.apps.com. [1s] A 10.130.248.63,test-nscd.apps.com. [1s] A 10.130.248.64 (83)
    --
    16:43:57.842106 IP (tos 0x80, ttl 49, id 60723, offset 0, flags [none], proto UDP (17), length 111)
    10.230.167.65.domain > test-client.55185: [udp sum ok] 1139 q: A?test-nscd.apps.com. 2/0/0test-nscd.apps.com. [1s] A 10.130.248.63,test-nscd.apps.com. [1s] A 10.130.248.64 (83)
    

    2. sending traffic to test-nscd.apps.com with TTL 1 from 1 ~ 100 locust workers and checking traffic distributed at PM1, PM2

    enter image description here

    3. sending traffic to test-nscd.apps.com with TTL 1 from 1 ~ 100 locust workers using keepalive and checking traffic distributed at PM1, PM2

    enter image description here

    4. (Comparison experiment) sending traffic to test-nscd.apps.com which is bound to machine JVM(JVM has its own dns caching). And checking traffic distributed at PM1, PM2

    enter image description here

    enter image description here

    enter image description here

    enter image description here

    Conclusion