Troubleshooting DataHub Ingestion Error: 'Did not find a registered class for c'

Original Slack Thread

Hey team, I am trying to run the command datahub docker ingest-sample-data, but I am getting this error:

Pipeline finished with at least 1 failures; produced 0 events in 0 seconds.
I am using CLI for ingestion
DataHub CLI version: 0.14.0.3
Models: bundled
Python version: 3.9.0
Does anyone know how to fix this problem?


To help you effectively with the error “Did not find a registered class for c” while trying to ingest sample data using datahub docker ingest-sample-data, I’ll need a bit more information:

  1. Are you using the UI or CLI for ingestion?
  2. Which DataHub version are you using (e.g., 0.12.0)?
  3. What data source(s) are you integrating with DataHub (e.g., BigQuery)?

Based on the error message you provided, it seems like the DataHub ingestion pipeline is not recognizing a specific source type, which is causing the ingestion to fail. This can happen if the necessary plugin for the source type is not installed or configured correctly.

Here are some steps to troubleshoot and resolve this issue:

  1. Ensure Proper Installation:
    Make sure you have installed the necessary DataHub plugins. Depending on your setup, you might need to install the plugin using the following command:

    pip install 'acryl-datahub[<source_type>]'
    
  2. Check Plugin Installation:
    Verify that the plugin is loaded by running:

    datahub check plugins --verbose
    

    Ensure that the required plugin is listed.

  3. Set Up the Ingestion Recipe:
    Make sure your ingestion recipe is correctly configured to use the appropriate source. Here is an example of what your recipe might look like:

    source:
      type: <source_type>
      config:
        # Add your source configuration here
    sink:
      type: datahub-rest
      config:
        server: "http://localhost:8080"
        token: "<your-datahub-token>"
    
  4. Run the Ingestion:
    Execute the ingestion command with the configured recipe:

    datahub ingest -c ./path_to_your_recipe.yml
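
Steps 3 and 4 can also be driven from Python instead of a YAML file. The following is a minimal sketch, assuming the acryl-datahub package is installed and that its Pipeline class (from datahub.ingestion.run.pipeline, the module that appears in the logs in this thread) is used to run a recipe passed as a dictionary. The helper names build_recipe and run_recipe are hypothetical, and the source type, server, and token are placeholders taken from the recipe above.

```python
from typing import Any, Dict, Optional


def build_recipe(
    source_type: str,
    source_config: Dict[str, Any],
    server: str = "http://localhost:8080",
    token: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a recipe dictionary with the same shape as the YAML recipe."""
    sink_config: Dict[str, Any] = {"server": server}
    if token:
        # Only include the token when one is provided.
        sink_config["token"] = token
    return {
        "source": {"type": source_type, "config": source_config},
        "sink": {"type": "datahub-rest", "config": sink_config},
    }


def run_recipe(recipe: Dict[str, Any]) -> None:
    """Run the recipe; requires acryl-datahub to be installed."""
    # Imported lazily so build_recipe works without DataHub installed.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    # Raises if the pipeline reported failures.
    pipeline.raise_from_status()


# Placeholder values: replace them before calling run_recipe(recipe).
recipe = build_recipe("<source_type>", {}, token="<your-datahub-token>")
```

With a real source type and configuration filled in, calling run_recipe(recipe) should be roughly equivalent to running datahub ingest -c with the same recipe saved as YAML.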
    

If you have followed these steps and are still encountering issues, please let me know, and I can provide more detailed troubleshooting steps.

Would you like more detailed instructions on any of these steps?

"Did not find a registered class for c": is this really all you see, or is there more specific info about the class it is missing?

Hi <@UV14447EU>,

Here is the full error message:

Starting ingestion...
[2024-08-29 11:12:18,381] INFO     {datahub.ingestion.run.pipeline:272} - Sink configured successfully. DataHubRestEmitter: configured to talk to http://localhost:8080
[2024-08-29 11:12:18,458] INFO     {datahub.ingestion.run.pipeline:296} - Source configured successfully.
[2024-08-29 11:12:18,568] ERROR    {datahub.ingestion.run.pipeline:737} - Ingestion pipeline threw an uncaught exception
Traceback (most recent call last):
  File "C:\Users\Nawal\AppData\Local\Programs\Python\Python39\lib\site-packages\datahub\ingestion\run\pipeline.py", line 461, in run
    for wu in itertools.islice(
  File "C:\Users\Nawal\AppData\Local\Programs\Python\Python39\lib\site-packages\datahub\ingestion\api\source_helpers.py", line 147, in auto_workunit_reporter
    for wu in stream:
  File "C:\Users\Nawal\AppData\Local\Programs\Python\Python39\lib\site-packages\datahub\ingestion\source\file.py", line 212, in get_workunits_internal
    for f in self.get_filenames():
  File "C:\Users\Nawal\AppData\Local\Programs\Python\Python39\lib\site-packages\datahub\ingestion\source\file.py", line 197, in get_filenames
    fs_class = fs_registry.get(schema)
  File "C:\Users\Nawal\AppData\Local\Programs\Python\Python39\lib\site-packages\datahub\ingestion\api\registry.py", line 172, in get
    raise KeyError(f"Did not find a registered class for {key}")
KeyError: 'Did not find a registered class for c'
[2024-08-29 11:12:18,583] ERROR    {datahub.ingestion.run.pipeline:738} - Pipeline Error: Ingestion pipeline raised an unexpected exception!Pipeline Error: Ingestion pipeline raised an unexpected exception!: 'Did not find a registered class for c'
Pipeline finished with at least 1 failures; produced 0 events in 0 seconds.
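
The traceback ends in fs_registry.get(schema) inside the file source, so the missing key c looks like a filesystem scheme rather than a source plugin name. One plausible reading, which is an assumption and is not confirmed anywhere in this thread, is that a Windows path such as C:\Users\... was parsed like a URL, so the drive letter was taken as the scheme. The sketch below reproduces that behavior with the Python standard library only. SCHEME_HANDLERS and lookup_handler are hypothetical stand-ins for DataHub's registry, and the sample path is made up for illustration.

```python
from pathlib import PureWindowsPath
from urllib.parse import urlparse

# Hypothetical stand-in for a scheme -> filesystem handler registry.
SCHEME_HANDLERS = {"file": "local filesystem", "http": "HTTP", "s3": "S3"}


def lookup_handler(path: str) -> str:
    """Pick a handler from the URL scheme, defaulting to 'file' if none."""
    scheme = urlparse(path).scheme or "file"
    if scheme not in SCHEME_HANDLERS:
        # Same message format as the KeyError shown in the traceback.
        raise KeyError(f"Did not find a registered class for {scheme}")
    return SCHEME_HANDLERS[scheme]


# Illustrative Windows path (not the real sample-data location).
windows_path = r"C:\Users\someone\sample.json"

# The drive letter before ':' is parsed as a URL scheme.
drive_scheme = urlparse(windows_path).scheme

try:
    lookup_handler(windows_path)
    error_message = None
except KeyError as exc:
    error_message = exc.args[0]

# A file:// URI built from the same path carries an explicit 'file' scheme.
file_uri = PureWindowsPath(windows_path).as_uri()
uri_handler = lookup_handler(file_uri)
```

If this reading is right, the failure is about how the file path is interpreted on Windows, not about a missing source plugin, so installing extra plugins would not be expected to change it.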