Tool pool: CSV file NaN value reformatting (has a problem) tools6 #125

@zanguixuan3

Description

CSV file NaN value reformatting (has a problem)

2025-12-31 01:33:31 | INFO | data_engine.utils.logger_utils:144 - Create logger ID 3 with loglevel: INFO, export to /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/output/log/tool_reformat_csv_nan_value_preprocess_internal_time_20251231013331.txt
2025-12-31 01:33:31 | INFO | data_engine.core.executor_tools:52 - Preparing tool...
2025-12-31 01:33:31 | INFO | data_engine.tools.base_tool:44 - Setting up data ingester...
2025-12-31 01:33:31 | INFO | data_engine.ingester.csghub_ingester:30 - Using dataset_path: /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/input, repo:longrui/tools6, branch:main
2025-12-31 01:33:31 | INFO | data_engine.tools.base_tool:55 - Preparing exporter...
2025-12-31 01:33:31 | INFO | data_engine.core.executor_tools:59 - Launching tool...
2025-12-31 01:33:31 | INFO | data_engine.ingester.csghub_ingester:41 - model_id:longrui/tools6
2025-12-31 01:33:31 | INFO | data_engine.ingester.csghub_ingester:43 - endpoint:http://modelhub.cmr-co.com
2025-12-31 01:33:31 | INFO | data_engine.ingester.csghub_ingester:44 - Input params: repo_id:longrui/tools6, repo_type:dataset, revision:main, cache_dir:/data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/input, endpoint:http://modelhub.cmr-co.com, token:b2bc8452d426461d8e4aac51b82fdebc

Downloading .gitattributes: 0%| | 0.00/2.34k [00:00<?, ?B/s]
Downloading .gitattributes: 100%|##########| 2.34k/2.34k [00:00<00:00, 2.87MB/s]

Downloading README.md: 0%| | 0.00/25.0 [00:00<?, ?B/s]
Downloading README.md: 100%|##########| 25.0/25.0 [00:00<00:00, 35.9kB/s]

Downloading demo.csv: 0%| | 0.00/172 [00:00<?, ?B/s]
Downloading demo.csv: 100%|##########| 172/172 [00:00<00:00, 181kB/s]
2025-12-31 01:33:32 | INFO | data_engine.ingester.csghub_ingester:54 - result: /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/input, _src_path: /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/input
2025-12-31 01:33:32 | INFO | data_engine.tools.base_tool:95 - Data ingested from /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/input
_accelerator 5555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555
2025-12-31 01:33:32 | DEBUG | data_engine.tools.base_tool:137 - Op [reformat_csv_nan_value_preprocess_internal] running with number of procs:3
2025-12-31 01:33:32 | INFO | data_engine.tools.base_tool:109 - Processing tool...

Generating train split: 0 examples [00:00, ? examples/s]
Generating train split: 4 examples [00:00, 164.08 examples/s]

Creating json from Arrow format: 0%| | 0/1 [00:00<?, ?ba/s]
Creating json from Arrow format: 100%|##########| 1/1 [00:00<00:00, 220.67ba/s]
_accelerator -5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5-5
2025-12-31 01:33:33 | INFO | data_engine.tools.base_tool:114 - Tool are done in 1.369s.
2025-12-31 01:33:33 | INFO | data_engine.tools.base_tool:121 - Exporting dataset to somewhere...
2025-12-31 01:33:33 | INFO | data_engine.exporter.csghub_exporter:97 - Start to upload /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/output/_df_dataset.jsonl/_data to repo: longrui/tools6 with branch: main
2025-12-31 01:33:33 | INFO | data_engine.exporter.csghub_exporter:200 - repo longrui/tools6 all branches: ['main', 'refs-convert-parquet']
2025-12-31 01:33:33 | INFO | data_engine.exporter.csghub_exporter:153 - Start to push /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/output/_df_dataset.jsonl/_data to repo: longrui/tools6 with branch: v1,user_name: longrui, token: b2bc8452d426461d8e4aac51b82fdebc
2025-12-31 01:33:35 | INFO | data_engine.exporter.csghub_exporter:166 - Done push /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/output/_df_dataset.jsonl/_data to repo: longrui/tools6 with branch: v1
2025-12-31 01:33:35 | INFO | data_engine.exporter.csghub_exporter:169 - Remove /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/output/_git
2025-12-31 01:33:35 | INFO | data_engine.exporter.csghub_exporter:172 - Remove /data/dataflow/CSV文件NaN值重格式ools6_d88f451f-3e5b-4b9c-8294-a718c88d7010/output/_df_dataset.jsonl/_data
2025-12-31 01:33:35 | WARNING | data_server.job.JobExecutor:127 - Job 113 still in PROCESSING state in finally block, marking as FAILED
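
The log above shows the push completing at 01:33:35 with no error, followed immediately by the warning that job 113 was still PROCESSING in the finally block. One way this pattern can arise is a success path that never updates the job status, so a safety net in `finally` treats the job as failed. The sketch below only illustrates that hypothesis: `Job`, `JobStatus`, `run_job`, and the `mark_finished` flag are made-up names, and the real `data_server.job.JobExecutor` code is not shown in this issue.

```python
from enum import Enum


class JobStatus(Enum):
    PROCESSING = "PROCESSING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"


class Job:
    """Hypothetical job record; starts in PROCESSING like job 113 in the log."""

    def __init__(self, job_id):
        self.job_id = job_id
        self.status = JobStatus.PROCESSING


def run_job(job, tool, mark_finished=True):
    """Run `tool` and settle the job status (hypothetical executor).

    `mark_finished=False` simulates a success path that forgets to
    update the status, which is the suspected cause of the warning.
    """
    try:
        tool()
        if mark_finished:
            job.status = JobStatus.FINISHED
    except Exception:
        job.status = JobStatus.FAILED
        raise
    finally:
        # Safety net: a job must never be left in PROCESSING.
        if job.status is JobStatus.PROCESSING:
            job.status = JobStatus.FAILED
    return job.status


# The tool succeeds, but the status is never updated: the job ends up FAILED.
print(run_job(Job(113), lambda: None, mark_finished=False))
# The tool succeeds and the status is updated: the job ends up FINISHED.
print(run_job(Job(114), lambda: None))
```

If the actual executor follows a similar shape, the place to check would be whether the tool's success path sets a terminal status before the finally block runs.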

Labels: P0, bug (Something isn't working)
