I've built a modular system that streams voice over UDP between an Android phone and anything with a mic and a speaker. If the mic and the speaker are close to each other, anything recorded on the phone is played back on the speaker, picked up by the mic at the same time, and sent back to the phone, which creates an unavoidable echo.
The root of the problem is that the mic keeps recording during playback. If, for example, a command were sent to stop recording whenever something important is playing, the problem would be solved. This can be done either with a push-to-talk button (not very suitable for a phone-call type of situation) or with an algorithm that measures the maximum amplitude and decides whether something important is being said or it's just noise. The latter is easier to use but quite error-prone in noisy environments.
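To make the amplitude-based idea concrete, here is a minimal sketch of muting the mic while playback is loud. It is only an illustration, not code from my system: the function name `gate_mic`, the `threshold` and `hangover` values, and the assumption that samples are unsigned 8-bit with silence at 128 are all made up for the example.

```python
SILENCE = 128  # assumed midpoint of unsigned 8-bit audio


def gate_mic(playback_frames, mic_frames, threshold=20, hangover=1):
    """Return mic frames, replaced by silence while playback is loud.

    A frame counts as loud when its peak deviation from SILENCE reaches
    `threshold`. The mic stays muted for `hangover` extra frames after
    the last loud frame, to cover the tail of the echo.
    """
    hold = 0
    gated = []
    for play, mic in zip(playback_frames, mic_frames):
        peak = max(abs(sample - SILENCE) for sample in play)
        if peak >= threshold:
            hold = hangover      # restart the hangover countdown
            muted = True
        elif hold > 0:
            hold -= 1            # still inside the hangover window
            muted = True
        else:
            muted = False
        gated.append([SILENCE] * len(mic) if muted else list(mic))
    return gated


quiet, loud = [128] * 4, [200] * 4
playback = [quiet, loud, quiet, quiet, quiet]
mic = [[130] * 4 for _ in playback]
result = gate_mic(playback, mic)
```

The weakness mentioned above shows up directly: in a noisy room any fixed `threshold` is either too low (the mic is muted all the time) or too high (the echo leaks through).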
Sorry about the long lecture, but I'd like to know if there's a more efficient/fault-proof way to solve this problem.
P.S. I don't have enough processing power on the hardware side to remove the echo digitally, nor do I know how to. I'm sure it's possible to do it with some kind of analog filter as well, but I have no idea how to do that either.
EDIT: Thanks to mattm, I now know I need something like acoustic echo cancellation (AEC), although it might not be the most efficient or even practical option.
I'm using a Wi-Fi-to-UART module (HLK-RM04), and it has a huge problem: there's a very significant delay when the module packs 8-bit audio samples into IP datagrams. For some reason this delay doesn't exist when it's unpacking the datagrams sent from Android: it takes ~50 ms to receive a sample from Android, but ~650 ms to receive one from the mic. Since I'm using a sampling rate of 7200 Hz, the AEC algorithm must work on a data set of at least 4680 samples (0.65 s × 7200 samples/s) to keep it "real-time".
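The arithmetic behind that figure, as a small sketch. The constant and function names are just illustrative, and adding the two legs together to get a round-trip figure is my own assumption about where the canceller would sit (on the Android side), not something I have measured.

```python
SAMPLE_RATE_HZ = 7200          # sampling rate stated above
ANDROID_TO_MODULE_MS = 50      # approximate delay, Android -> module
MIC_TO_ANDROID_MS = 650        # approximate delay, mic -> Android


def delay_in_samples(delay_ms, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert a delay in milliseconds to a whole number of samples."""
    return delay_ms * sample_rate_hz // 1000


downlink = delay_in_samples(ANDROID_TO_MODULE_MS)   # 360 samples
uplink = delay_in_samples(MIC_TO_ANDROID_MS)        # 4680 samples
round_trip = downlink + uplink                      # 5040 samples (assumed echo path)
```

So an echo canceller on the phone would have to keep several thousand samples of sent audio around before the matching echo even arrives.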
The simplest solution is to use headsets or telephone handsets, so that the sound output for one user is not fed back into the same user's microphone.
Your problem is solved by acoustic echo cancellation. You can see examples of digital implementations of acoustic echo cancellation in the Speex audio codec, which is older now, or the implementation in WebRTC. These implementations are definitely nontrivial.
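To give a feel for the core idea, here is a toy sketch of a normalized LMS (NLMS) adaptive filter, a common textbook starting point for echo cancellation. It is not how Speex or WebRTC implement it; those are far more elaborate. The function name `nlms_cancel`, the parameter values, and the synthetic test signal (white noise echoed back at 0.6 gain after 3 samples, with no near-end speech) are all assumptions chosen for illustration.

```python
import random


def nlms_cancel(far, mic, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the far-end echo from the mic signal.

    far: samples sent to the loudspeaker; mic: samples from the microphone.
    Returns the residual (mic minus estimated echo) for every sample.
    """
    weights = [0.0] * taps
    history = [0.0] * taps               # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate             # what remains after removing the estimate
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm
                   for w, h in zip(weights, history)]
        residual.append(error)
    return residual


random.seed(0)
far = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [0.0] * 3 + [0.6 * x for x in far[:-3]]   # pure echo, delayed by 3 samples
out = nlms_cancel(far, mic)

echo_energy = sum(v * v for v in mic[-200:])
residual_energy = sum(v * v for v in out[-200:])
```

Note that the filter can only cancel echo that arrives within `taps` samples of the far-end signal. With a delay of several hundred milliseconds, the far-end reference would have to be delayed to match, or the filter made very long, which is part of why real implementations are nontrivial.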
I have no experience with analog acoustic echo cancellation.
If you cannot do acoustic echo cancellation on one party (non-Android), the experience for the other party (Android) will be poor.