Description
CSV file NaN value reformatting tool (CSV文件NaN值重格式): has a problem
2025-12-31 01:33:31 | INFO | data_engine.utils.logger_utils:144 - Create logger ID 3 with loglevel: INFO, export to /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/output/log/tool_reformat_csv_nan_value_preprocess_internal_time_20251231013331.txt
2025-12-31 01:33:31 | INFO | data_engine.core.executor_tools:52 - Preparing tool...
2025-12-31 01:33:31 | INFO | data_engine.tools.base_tool:44 - Setting up data ingester...
2025-12-31 01:33:31 | INFO | data_engine.ingester.csghub_ingester:30 - Using dataset_path: /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/input, repo:longrui/tools6, branch:main
2025-12-31 01:33:31 | INFO | data_engine.tools.base_tool:55 - Preparing exporter...
2025-12-31 01:33:31 | INFO | data_engine.core.executor_tools:59 - Launching tool...
2025-12-31 01:33:31 | INFO | data_engine.ingester.csghub_ingester:41 - model_id:longrui/tools6
2025-12-31 01:33:31 | INFO | data_engine.ingester.csghub_ingester:43 - endpoint:http://modelhub.cmr-co.com
2025-12-31 01:33:31 | INFO | data_engine.ingester.csghub_ingester:44 - input parameters: repo_id:longrui/tools6, repo_type:dataset, revision:main, cache_dir:/data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/input, endpoint:http://modelhub.cmr-co.com, token:b2bc8452d426461d8e4aac51b82fdebc
Downloading .gitattributes: 0%| | 0.00/2.34k [00:00<?, ?B/s]
Downloading .gitattributes: 100%|##########| 2.34k/2.34k [00:00<00:00, 2.87MB/s]
Downloading README.md: 0%| | 0.00/25.0 [00:00<?, ?B/s]
Downloading README.md: 100%|##########| 25.0/25.0 [00:00<00:00, 35.9kB/s]
Downloading demo.csv: 0%| | 0.00/172 [00:00<?, ?B/s]
Downloading demo.csv: 100%|##########| 172/172 [00:00<00:00, 181kB/s]
2025-12-31 01:33:32 | INFO | data_engine.ingester.csghub_ingester:54 - result: /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/input, _src_path: /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/input
2025-12-31 01:33:32 | INFO | data_engine.tools.base_tool:95 - Data ingested from /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/input
_accelerator 5555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555
2025-12-31 01:33:32 | DEBUG | data_engine.tools.base_tool:137 - Op [reformat_csv_nan_value_preprocess_internal] running with number of procs:3
2025-12-31 01:33:32 | INFO | data_engine.tools.base_tool:109 - Processing tool...
Generating train split: 0 examples [00:00, ? examples/s]
Generating train split: 4 examples [00:00, 164.08 examples/s]
Creating json from Arrow format: 0%| | 0/1 [00:00<?, ?ba/s]
Creating json from Arrow format: 100%|##########| 1/1 [00:00<00:00, 220.67ba/s]
_accelerator -5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5
2025-12-31 01:33:33 | INFO | data_engine.tools.base_tool:114 - Tool are done in 1.369s.
2025-12-31 01:33:33 | INFO | data_engine.tools.base_tool:121 - Exporting dataset to somewhere...
2025-12-31 01:33:33 | INFO | data_engine.exporter.csghub_exporter:97 - Start to upload /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/output/_df_dataset.jsonl/_data to repo: longrui/tools6 with branch: main
2025-12-31 01:33:33 | INFO | data_engine.exporter.csghub_exporter:200 - repo longrui/tools6 all branches: ['main', 'refs-convert-parquet']
2025-12-31 01:33:33 | INFO | data_engine.exporter.csghub_exporter:153 - Start to push /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/output/_df_dataset.jsonl/_data to repo: longrui/tools6 with branch: v1,user_name: longrui, token: b2bc8452d426461d8e4aac51b82fdebc
2025-12-31 01:33:35 | INFO | data_engine.exporter.csghub_exporter:166 - Done push /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/output/_df_dataset.jsonl/_data to repo: longrui/tools6 with branch: v1
2025-12-31 01:33:35 | INFO | data_engine.exporter.csghub_exporter:169 - Remove /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/output/_git
2025-12-31 01:33:35 | INFO | data_engine.exporter.csghub_exporter:172 - Remove /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/output/_df_dataset.jsonl/_data
2025-12-31 01:33:35 | WARNING | data_server.job.JobExecutor:127 - Job 113 still in PROCESSING state in finally block, marking as FAILED
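
The log shows the tool finishing and the export being pushed, yet the last line marks Job 113 as FAILED because its state was still PROCESSING when the finally block ran. One possible reading is that the success path never updated the job state. The sketch below is not the real `data_server.job.JobExecutor` code; `Job`, `JobStatus`, `run_job`, and the `mark_finished` flag are hypothetical names used only to show how such a pattern would produce this warning even when the work itself succeeds.

```python
from enum import Enum


class JobStatus(Enum):
    PROCESSING = "PROCESSING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"


class Job:
    """Hypothetical job record; a new job starts in PROCESSING."""

    def __init__(self, job_id):
        self.job_id = job_id
        self.status = JobStatus.PROCESSING


def run_job(job, tool, mark_finished=True):
    """Run `tool` and guarantee the job does not stay in PROCESSING.

    Returns the list of warning messages emitted by the finally block.
    `mark_finished=False` simulates a success path that forgets to
    update the job state (an assumption about the bug, not a confirmed cause).
    """
    warnings = []
    try:
        tool()
        if mark_finished:
            job.status = JobStatus.FINISHED
    except Exception:
        job.status = JobStatus.FAILED
        raise
    finally:
        # Safety net: anything still PROCESSING here is treated as failed.
        if job.status is JobStatus.PROCESSING:
            warnings.append(
                f"Job {job.job_id} still in PROCESSING state in finally block, "
                "marking as FAILED"
            )
            job.status = JobStatus.FAILED
    return warnings


if __name__ == "__main__":
    job = Job(113)
    for message in run_job(job, lambda: None, mark_finished=False):
        print(message)
    print(job.status)
```

If the real executor follows a similar shape, the place to check would be whether the success branch sets the final state before the finally block runs.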
