Troubleshooting JSONDecodeError in DataHub CLI Version 0.12.0

Original Slack Thread

Hi team, I am getting requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0) when I try to run anything from the CLI.

DataHub CLI version: 0.12.0

Not really sure what’s happening - <@U04N9PYJBEW> Any ideas?

Hmm, I’m not sure what’s going on here either. Can you run datahub --version successfully? What about datahub init? If you can, make sure that you’re pointing to the correct DataHub instance with that command.

DataHub CLI version: 0.12.0
pointing to server: https://datahub.ripplinginternal.com/gms

Hi folks, I’m running into the same situation. Do you have any updates on this issue?
I’m able to get the GMS config from the browser with the GMS server URL https://{datahub_host}/config, but I cannot get the DataHub CLI working with the same URL. It always runs into JSONDecodeError.
DataHub CLI version: 0.12.0

And do you have https://{datahub_host} as the URL in your .datahubenv file? cc <@U01GZEETMEZ>

yes, this is my .datahubenv definition, I didn’t enable authentication tokens.

```
  server: https://{datahub_host}
  token: ''
```

Can you post the full logs with the exact error? Thanks for your patience.

```
Traceback (most recent call last):
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/requests/models.py", line 971, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/datahub/entrypoints.py", line 188, in main
    sys.exit(datahub(standalone_mode=False, **kwargs))
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/datahub/upgrade/upgrade.py", line 398, in async_wrapper
    loop.run_until_complete(run_func_check_upgrade())
  File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/datahub/upgrade/upgrade.py", line 385, in run_func_check_upgrade
    ret = await the_one_future
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/datahub/upgrade/upgrade.py", line 378, in run_inner_func
    return await loop.run_in_executor(
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 448, in wrapper
    raise e
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 397, in wrapper
    res = func(*args, **kwargs)
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 443, in list_runs
    rows = parse_restli_response(response)
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 372, in parse_restli_response
    response_json = response.json()
  File "/home/kli/Github/av/venv/lib/python3.10/site-packages/requests/models.py", line 975, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```
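
The traceback ends in `response.json()`, which fails when the response body is not JSON at all. As a minimal sketch of why the message points at "char 0", the snippet below reproduces the same error with the standard `json` module. The helper name `describe_json_failure` and the sample bodies are illustrative assumptions, not part of the DataHub CLI.

```python
import json


def describe_json_failure(body: str) -> str:
    """Return "ok" if body parses as JSON, otherwise a short diagnostic."""
    try:
        json.loads(body)
        return "ok"
    except json.JSONDecodeError as exc:
        # exc.msg and exc.pos are the same fields requests re-raises with.
        preview = body[:60].replace("\n", " ")
        return f"{exc.msg} at char {exc.pos}; body starts with: {preview!r}"


# An HTML page (for example a redirect or login page) fails at the first character.
html_body = "<html>\n<head><title>301 Moved Permanently</title></head>\n</html>"
print(describe_json_failure(html_body))

# An empty body fails the same way.
print(describe_json_failure(""))

# A JSON body parses fine.
print(describe_json_failure('{"noCode": "true"}'))
```

An error at position 0 therefore usually means the server answered with HTML or an empty body, which is worth checking before suspecting the CLI itself.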

If you run curl http://{host}/config, what does that show? I suspect that it’s connecting to the frontend container and not GMS.

<@U01GZEETMEZ> What would be the port for GMS in Managed DataHub?

With Acryl cloud, you can use https://<customer>.acryl.io/gms as the GMS URL. No port number is needed.

Fantastic <@U01GZEETMEZ> That worked

<@U01GZEETMEZ> This is my .datahubenv definition, I didn’t enable authentication tokens.

```
  server: https://{datahub_host}/api/gms/
  token: '<api token from UI>'
```
When I run curl `http://{host}/config` I get
```
<html>
<head><title>301 Moved Permanently</title></head>
<body>
<center><h1>301 Moved Permanently</h1></center>
</body>
</html>
```

When I try
datahub exists --urn "urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD)"
I get
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
any idea what the fix is?

It looks like you’re connecting to the incorrect host then - do you have a proxy or some other redirect sitting in front of your datahub instance?
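
One way to narrow down a 301 like the one above is to look at where the redirect points. The following is a rough sketch, assuming you already have the status code and `Location` header from a request that did not follow redirects; the function name `explain_redirect` and the example host `datahub.example.com` are hypothetical.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def explain_redirect(requested_url: str, status: int, location: Optional[str]) -> str:
    """Describe what kind of redirect a response represents."""
    if status not in (301, 302, 303, 307, 308) or not location:
        return "no redirect"
    # Location may be relative, so resolve it against the requested URL.
    target = urljoin(requested_url, location)
    src, dst = urlsplit(requested_url), urlsplit(target)
    if src.hostname != dst.hostname:
        return f"redirects to a different host: {dst.hostname}"
    if src.scheme != dst.scheme:
        return f"redirects from {src.scheme} to {dst.scheme}"
    if src.path != dst.path:
        return f"redirects to a different path: {dst.path}"
    return "redirects to itself"


# To collect the inputs with the requests library, one option is:
#   resp = requests.get(url, allow_redirects=False)
#   explain_redirect(url, resp.status_code, resp.headers.get("Location"))
print(explain_redirect(
    "http://datahub.example.com/config",
    301,
    "https://datahub.example.com/config",
))
```

A plain http to https redirect suggests simply using the https URL, while a redirect to another host or to a login path points at a proxy or the frontend sitting in front of GMS.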

This looks like something specific to your setup, so I’m not sure how much help I can give. In general, the /config endpoint should return a JSON object, e.g.

```
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   624    0   624    0     0  66242      0 --:--:-- --:--:-- --:--:--  304k
{
  "models": {},
  "managedIngestion": {
    "defaultCliVersion": "something",
    "enabled": true
  },
  "timeZone": "GMT",
  "datasetUrnNameCasing": false,
  "datahub": {
    "serverType": "dev"
  },
  "baseUrl": "http://localhost:9002",
  "patchCapable": true,
  "versions": {
    "linkedin/datahub": {
      "version": "...",
      "commit": "c5142f1847838e7ff2ed894458912abb65f2b686"
    }
  },
  "statefulIngestionCapable": true,
  "supportsImpactAnalysis": true,
  "telemetry": {
    "enabledCli": true,
    "enabledIngestion": false
  },
  "retention": "true",
  "noCode": "true"
}
```
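
The same check can be scripted. Below is a minimal sketch, not an official DataHub utility, that fetches `/config` from a candidate server URL and classifies the answer. The names `classify_config_body` and `fetch_config` are hypothetical, and treating a `versions` key as the sign of a GMS response is an assumption based only on the sample output above.

```python
import json
import urllib.error
import urllib.request
from typing import Tuple


def classify_config_body(status: int, body: str) -> str:
    """Classify a /config response without touching the network."""
    if status != 200:
        return "http-error"
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        return "not-json"
    # Assumption: a GMS /config payload is an object with a "versions" key,
    # as in the sample output in this thread.
    if isinstance(parsed, dict) and "versions" in parsed:
        return "looks-like-gms"
    return "unexpected-json"


def fetch_config(server: str, timeout: float = 10.0) -> Tuple[int, str]:
    """GET <server>/config and return (status, body).

    Note: urllib follows redirects automatically, so the status returned
    here is that of the final response.
    """
    url = server.rstrip("/") + "/config"
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", errors="replace")


# Example usage (requires network access to your own instance):
#   status, body = fetch_config("https://<your-host>/gms")
#   print(classify_config_body(status, body))
```

If the result is `not-json` for the URL in `.datahubenv`, the CLI will most likely hit the same JSONDecodeError, and the server value is probably pointing at something other than GMS.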

This is fixed. You were right, I was pointing to the wrong endpoint.