Troubleshooting 'Connection Refused' Error in GMS Deployment

Original Slack Thread

Hi,
We deployed GMS using Helm and the DataHub prebuilt image, version v0.10.5.
We were getting the “connection refused” error below. After a Google search we found this thread: https://www.linen.dev/s/datahubspace/t/9721413/hello-i-am-trying-to-update-datahub-from-0-9-5-to-0-10-0-i-r
Following the possible solution from that thread, we deployed datahub-upgrade v0.10.5 via Helm. The system update job ran successfully, but we are still seeing the same error.
In the upgrade job, we use “DATAHUB_GMS_HOST = ”, DATAHUB_GMS_PORT = “8080”

Could you please help us figure out the root cause?
***** Error from GMS Logs *************
2023-11-17 14:37:21,588 [R2 Nio Event Loop-1-4] WARN c.l.r.t.h.c.c.ChannelPoolLifecycle:139 - Failed to create channel, remote=localhost/127.0.0.1:8080
io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:8080
Caused by: java.net.ConnectException: Connection refused
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:337)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:776)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.base/java.lang.Thread.run(Thread.java:829)
2023-11-17 14:37:21,590 [pool-17-thread-1] ERROR c.d.m.ingestion.IngestionScheduler:244 - Failed to retrieve ingestion sources! Skipping updating schedule cache until next refresh. start: 0, count: 30
com.linkedin.r2.RemoteInvocationException: com.linkedin.r2.RemoteInvocationException: Failed to get response from server for URI http://localhost:8080/entities



In my experience, error messages like these are usually follow-on faults: the real cause lies elsewhere (Kafka problems, connection issues with the DB or Elasticsearch, etc.). Can you provide the full log?
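
While collecting the full log, one quick way to narrow down which dependency is unreachable is to test plain TCP connectivity from inside the cluster. The following is only a minimal sketch: it takes host:port pairs on the command line, and the helper names (parse_target, can_connect, check_all) are made up for illustration, not part of DataHub.

```python
import socket
import sys


def parse_target(target: str) -> tuple[str, int]:
    """Split a 'host:port' string into (host, port)."""
    host, _, port = target.rpartition(":")
    return host, int(port)


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS resolution failures.
        return False


def check_all(targets: list[str]) -> dict[str, bool]:
    """Map each 'host:port' target to whether it accepted a connection."""
    return {t: can_connect(*parse_target(t)) for t in targets}


if __name__ == "__main__":
    # Targets come from the command line; nothing is checked if none are given.
    for target, ok in check_all(sys.argv[1:]).items():
        status = "reachable" if ok else "refused or unreachable"
        print(f"{target}: {status}")
```

For example, running it with localhost:8080 (the address from the GMS log above) plus the Kafka, database, and Elasticsearch addresses from your own Helm values shows which of them refuses connections. A refused port only says that nothing is listening there; it does not explain why.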

Have you solved it? I encountered the same problem.

Your upgrade is either incomplete or unsuccessful; GMS is just waiting for the upgrade to finish. This thread might help you:
https://datahubspace.slack.com/archives/C029A3M079U/p1689793000749939
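
If GMS is indeed waiting on the upgrade, it can help to poll it until it starts answering instead of reading the logs repeatedly. Below is a rough sketch using only the Python standard library. The /health path in the usage note is an assumption about the GMS deployment, and wait_until_up is a hypothetical helper name; adjust both to your setup.

```python
import http.client
import sys
import time
import urllib.request


def wait_until_up(url: str, attempts: int = 10, delay: float = 5.0) -> bool:
    """Poll url until it returns HTTP 200; give up after `attempts` tries."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            # Connection refused, timeout, HTTP error status, or bad response:
            # treat all of these as "not up yet".
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


if __name__ == "__main__":
    # The URL is passed on the command line; nothing runs without it.
    if len(sys.argv) > 1:
        ok = wait_until_up(sys.argv[1])
        print("up" if ok else "still not responding")
```

A possible invocation would pass http://localhost:8080/health as the argument, using the host and port from the log above. If the script keeps reporting that GMS is not responding long after the upgrade job has finished, the job's own logs are the next place to look.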

Thank you. I first started it with the official non-authentication method and then started it with the authentication method, and the problem was solved. The cause was that I had commented out the datahub-upgrade section in the authentication startup configuration, because otherwise the task failed to start. That may be specific to my configuration.