Hey team, I am trying to run the command datahub docker ingest-sample-data, but I am getting this error:
Pipeline finished with at least 1 failures; produced 0 events in 0 seconds.
I am using CLI for ingestion
DataHub CLI version: 0.14.0.3
Models: bundled
Python version: 3.9.0
Does anyone know how to fix this problem?
Hey there! Make sure your message includes the following information if relevant, so we can help more effectively!
Are you using UI or CLI for ingestion?
Which DataHub version are you using? (e.g. 0.12.0)
What data source(s) are you integrating with DataHub? (e.g. BigQuery)
<@U06TM7M40RK> Can you help with the questions in this thread?
To help you effectively with the error “Did not find a registered class for c” while trying to ingest sample data using datahub docker ingest-sample-data, I’ll need a bit more information:
Are you using the UI or CLI for ingestion?
Which DataHub version are you using (e.g., 0.12.0)?
What data source(s) are you integrating with DataHub (e.g., BigQuery)?
Based on the error message you provided, it seems like the DataHub ingestion pipeline is not recognizing a specific source type, which is causing the ingestion to fail. This can happen if the necessary plugin for the source type is not installed or configured correctly.
Here are some steps to troubleshoot and resolve this issue:
Ensure Proper Installation:
Make sure you have installed the necessary DataHub plugins. Depending on your setup, you might need to install the plugin using the following command:
pip install 'acryl-datahub[<source_type>]'
Check Plugin Installation:
Verify that the plugin is loaded by running:
datahub check plugins --verbose
Ensure that the required plugin is listed.
Set Up the Ingestion Recipe:
Make sure your ingestion recipe is correctly configured to use the appropriate source. Here is an example of what your recipe might look like:
source:
  type: <source_type>
  config:
    # Add your source configuration here
sink:
  type: datahub-rest
  config:
    server: "http://localhost:8080"
    token: "<your-datahub-token>"
Run the Ingestion:
Execute the ingestion command with the configured recipe:
datahub ingest -c ./path_to_your_recipe.yml
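Before running the recipe, it can help to sanity-check its shape. The following is only a minimal sketch: `recipe_problems` is a hypothetical helper and not part of the DataHub CLI, and it assumes the recipe has already been loaded into a plain Python dictionary with the `source` and `sink` sections shown above.

```python
from typing import Any, Dict, List


def recipe_problems(recipe: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the shape looks fine.

    Hypothetical helper for illustration only. It checks structure,
    not whether the named source or sink plugin is actually installed.
    """
    problems: List[str] = []
    for section in ("source", "sink"):
        block = recipe.get(section)
        if not isinstance(block, dict):
            problems.append(f"missing '{section}' section")
            continue
        type_name = block.get("type")
        if not isinstance(type_name, str) or not type_name:
            problems.append(f"'{section}.type' is missing or empty")
        elif type_name.startswith("<") and type_name.endswith(">"):
            # Template placeholders such as <source_type> were never filled in.
            problems.append(f"'{section}.type' still holds the placeholder {type_name}")
    return problems


# The recipe from the example above, with the placeholder left in on purpose.
recipe = {
    "source": {"type": "<source_type>", "config": {}},
    "sink": {"type": "datahub-rest", "config": {"server": "http://localhost:8080"}},
}
for problem in recipe_problems(recipe):
    print(problem)
```

A check like this only catches unfilled placeholders and missing sections; `datahub check plugins --verbose` is still the way to confirm that a plugin is available.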
If you have followed these steps and are still encountering issues, please let me know, and I can provide more detailed troubleshooting steps.
Would you like more detailed instructions on any of these steps?
Starting ingestion...
[2024-08-29 11:12:18,381] INFO {datahub.ingestion.run.pipeline:272} - Sink configured successfully. DataHubRestEmitter: configured to talk to http://localhost:8080
[2024-08-29 11:12:18,458] INFO {datahub.ingestion.run.pipeline:296} - Source configured successfully.
[2024-08-29 11:12:18,568] ERROR {datahub.ingestion.run.pipeline:737} - Ingestion pipeline threw an uncaught exception
Traceback (most recent call last):
  File "C:\Users\Nawal\AppData\Local\Programs\Python\Python39\lib\site-packages\datahub\ingestion\run\pipeline.py", line 461, in run
    for wu in itertools.islice(
  File "C:\Users\Nawal\AppData\Local\Programs\Python\Python39\lib\site-packages\datahub\ingestion\api\source_helpers.py", line 147, in auto_workunit_reporter
    for wu in stream:
  File "C:\Users\Nawal\AppData\Local\Programs\Python\Python39\lib\site-packages\datahub\ingestion\source\file.py", line 212, in get_workunits_internal
    for f in self.get_filenames():
  File "C:\Users\Nawal\AppData\Local\Programs\Python\Python39\lib\site-packages\datahub\ingestion\source\file.py", line 197, in get_filenames
    fs_class = fs_registry.get(schema)
  File "C:\Users\Nawal\AppData\Local\Programs\Python\Python39\lib\site-packages\datahub\ingestion\api\registry.py", line 172, in get
    raise KeyError(f"Did not find a registered class for {key}")
KeyError: 'Did not find a registered class for c'
[2024-08-29 11:12:18,583] ERROR {datahub.ingestion.run.pipeline:738} - Pipeline Error: Ingestion pipeline raised an unexpected exception!Pipeline Error: Ingestion pipeline raised an unexpected exception!: 'Did not find a registered class for c'
Pipeline finished with at least 1 failures; produced 0 events in 0 seconds.
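The traceback ends in fs_registry.get(schema) inside the file source's get_filenames, and the missing key is the single letter c. One plausible reading, offered as an assumption rather than a confirmed diagnosis, is that a Windows path beginning with C:\ is being parsed like a URL, so the drive letter is taken as the scheme. The sketch below reproduces that behaviour with Python's standard urllib.parse only. FS_REGISTRY, path_scheme, lookup, and the example paths are hypothetical stand-ins and are not DataHub's actual code.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for a registry of filesystem handlers keyed by URL scheme.
FS_REGISTRY = {"file": "local", "http": "http", "https": "http", "s3": "s3"}


def path_scheme(path: str) -> str:
    """Return the URL scheme of `path`, defaulting to 'file' when there is none."""
    scheme = urlparse(path).scheme
    return scheme or "file"


def lookup(path: str) -> str:
    """Look up a handler by scheme, failing the same way the traceback does."""
    key = path_scheme(path)
    if key not in FS_REGISTRY:
        raise KeyError(f"Did not find a registered class for {key}")
    return FS_REGISTRY[key]


# A POSIX-style path has no scheme, so it falls back to the local handler.
print(lookup("/tmp/sample.json"))

# A Windows drive-letter path (hypothetical) is read as scheme "c".
try:
    lookup(r"C:\temp\sample.json")
except KeyError as exc:
    print(exc)
```

If this reading is right, the failure is about how the local file path is interpreted on Windows rather than about a missing source plugin, which would explain why the source and sink both report being configured successfully just before the exception.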