We recently switched our Azure Functions app (built on Durable Functions) from a dedicated S1/Standard App Service plan to the dynamic Y1 (Consumption) plan to save money, and now we are getting a common error:
"A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond."
This happens after about an hour of the app running. The exception comes from a svcutil-generated WCF client. I'm fairly certain this is related to the limit on socket connections for a Consumption function app vs. a "dedicated" app plan, as described at https://learn.microsoft.com/en-us/azure/azure-functions/functions-scale#service-limits, but I'm not totally convinced because I do NOT see the log message "Host thresholds exceeded: Connections" listed at https://learn.microsoft.com/en-us/azure/azure-functions/manage-connections#connection-limit
Our client is actually a wrapper around a dozen WCF clients, all instantiated in the wrapper's constructor. The wrapper is registered with DI as a singleton:
    builder.Services.AddSingleton<IWrapperClient, OurSoapClient>();

    public OurSoapClient(
        IMemoryCache memoryCache,
        IOptions<Options> options,
        ILogger<OurSoapClient> log
    )
    {
        this.options = options.Value;
        this.memoryCache = memoryCache;
        this.log = log;
        this.metaClient = new Meta.MetaWebServiceClient(
            Meta.MetaWebServiceClient.EndpointConfiguration.MetaWebServicePort,
            this.options.MetaHref
        );
        this.wmsClient = new Wms.WmsWebServiceClient(
            Wms.WmsWebServiceClient.EndpointConfiguration.WmsWebServicePort,
            this.options.WmsHref
        );
        this.wmsStageItemsClient = new Wms.Stage.Items.WmsWebServiceClient(
            Wms.Stage.Items.WmsWebServiceClient.EndpointConfiguration.WmsWebServicePort,
            this.options.WmsHref
        );
        this.wmsReceiptClient = new Wms.Stage.ExpectedReceipts.WmsWebServiceClient(
            Wms.Stage.ExpectedReceipts.WmsWebServiceClient.EndpointConfiguration.WmsWebServicePort,
            this.options.WmsHref
        );
        this.wmsStageRmaClient = new Wms.Stage.Rma.WmsWebServiceClient(
            Wms.Stage.Rma.WmsWebServiceClient.EndpointConfiguration.WmsWebServicePort,
            this.options.WmsHref
        );
        this.wmsStageShipmentsClient = new Wms.Stage.Shipments.WmsWebServiceClient(
            Wms.Stage.Shipments.WmsWebServiceClient.EndpointConfiguration.WmsWebServicePort,
            this.options.WmsHref
        );
        this.wmsUpdateShipmentsClient = new Wms.Updates.ShippingResults.WmsWebServiceClient(
            Wms.Updates.ShippingResults.WmsWebServiceClient.EndpointConfiguration.WmsWebServicePort,
            this.options.WmsHref
        );
        this.wmsUpdatesReceivingResultsClient = new Wms.Updates.ReceivingResults.WmsWebServiceClient(
            Wms.Updates.ReceivingResults.WmsWebServiceClient.EndpointConfiguration.WmsWebServicePort,
            this.options.WmsHref
        );
        this.wmsUpdatesInventoryAdjustmentClient = new Wms.Updates.InventoryAdjustments.WmsWebServiceClient(
            Wms.Updates.InventoryAdjustments.WmsWebServiceClient.EndpointConfiguration.WmsWebServicePort,
            this.options.WmsHref
        );
        this.wmsInboundOrderClient = new Wms.Inbound.CurrentAndHistory.WmsWebServiceClient(
            Wms.Inbound.CurrentAndHistory.WmsWebServiceClient.EndpointConfiguration.WmsWebServicePort,
            this.options.WmsHref
        );
        this.wmsOutboundOrderClient = new Wms.Outbound.CurrentAndHistory.WmsWebServiceClient(
            Wms.Outbound.CurrentAndHistory.WmsWebServiceClient.EndpointConfiguration.WmsWebServicePort,
            this.options.WmsHref
        );
        this.wmsInboundOrderDetailsClient = new Wms.Inbound.CurrentAndHistoryDetails.WmsWebServiceClient(
            Wms.Inbound.CurrentAndHistoryDetails.WmsWebServiceClient.EndpointConfiguration.WmsWebServicePort,
            this.options.WmsHref
        );
        this.wmsOutboundOrderDetailsClient = new Wms.Outbound.CurrentAndHistoryDetails.WmsWebServiceClient(
            Wms.Outbound.CurrentAndHistoryDetails.WmsWebServiceClient.EndpointConfiguration.WmsWebServicePort,
            this.options.WmsHref
        );
    }
Switching back to the Standard App Service plan seems to make this go away. I'm fairly certain Durable Functions isn't a cause here, but just to be clear: all the calls to the client happen from Orchestrator or Activity functions, and we see the same errors in both function types.
One thing I've noticed repeatedly is that the errors seem to start just after a second OurSoapClient wrapper is instantiated (which instantiates all the WCF clients again). Since it's registered as a singleton, this must be the Azure Functions control plane spinning up another instance of my app.
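To confirm that a second wrapper really means a second host instance (rather than a second construction in the same process), it may help to log which instance the constructor ran on. The sketch below is not part of the original app: `HostInfo` and `Describe` are hypothetical names, and it assumes .NET 5 or later and that the platform sets the `WEBSITE_INSTANCE_ID` environment variable (it falls back to the machine name when it is not set, e.g. when running locally).

```csharp
using System;

// Hypothetical helper, not from the original app.
public static class HostInfo
{
    // WEBSITE_INSTANCE_ID is assumed to be provided by the hosting platform;
    // fall back to the machine name when it is absent.
    public static string InstanceId =>
        Environment.GetEnvironmentVariable("WEBSITE_INSTANCE_ID")
        ?? Environment.MachineName;

    // Builds a log line that ties a component's construction to a host instance and process.
    public static string Describe(string component) =>
        $"{component} constructed on instance {InstanceId} (pid {Environment.ProcessId})";
}
```

Calling something like `this.log.LogInformation(HostInfo.Describe(nameof(OurSoapClient)));` at the end of the constructor would then let the first failures be lined up against a specific instance in the logs.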
So, a couple of questions:
Using Application Insights, I noticed that the "takes about an hour" behavior corresponded to my app switching host instances around that time. Eventually I started to see that on deploys it would sometimes fail right away, i.e. it had landed on a "bad" host. I opened an MS support case; they remoted into a bad instance and found they could not TCP ping the target from that host.
Each webspace you are assigned makes its outbound requests from a pool of IPs, and I suspect my target's WAF was blocking some of these IPs for whatever reason. Switching to a new region, which guaranteed a new webspace (they're assigned on creation, but are region-specific), made the problem go away.
I did find https://github.com/dotnet/runtime/issues/35508 along the way, which seemed similar.
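The TCP ping check that support ran can be roughly approximated from inside the app itself, which makes it possible to tell a "bad" host from a WCF-level problem without remoting in. This is only a sketch under assumptions: `TcpProbe` and `CanConnectAsync` are hypothetical names, it assumes .NET 5 or later (for the `ConnectAsync` overload that takes a `CancellationToken`), and it only proves that a TCP handshake completes, not that the SOAP endpoint behaves.

```csharp
using System;
using System.Net.Sockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not from the original app.
public static class TcpProbe
{
    // Returns true if a TCP connection to host:port is established within the timeout,
    // false if it times out, is refused, or the host cannot be resolved.
    public static async Task<bool> CanConnectAsync(string host, int port, TimeSpan timeout)
    {
        using var client = new TcpClient();
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            await client.ConnectAsync(host, port, cts.Token);
            return true;
        }
        catch (OperationCanceledException)
        {
            // Timeout elapsed before the handshake completed.
            return false;
        }
        catch (SocketException)
        {
            // Refused, unreachable, or DNS failure.
            return false;
        }
    }
}
```

For example, `await TcpProbe.CanConnectAsync(new Uri(options.WmsHref).Host, 443, TimeSpan.FromSeconds(5))` could be run at startup and logged (port 443 is an assumption about the endpoint). If it returns false on some instances and true on others, that points at the outbound IP or network path rather than at the WCF clients or a connection limit.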