caUtils create-ngrams throws database exception "Out of range value for column 'seq'" #1836

@arrlee

Running the create-ngrams task of caUtils ends with an exception:

/var/www/providence/support/bin# ./caUtils create-ngrams
CollectiveAccess 2.0 (204/GIT) Utilities
(c) 2013-2025 Whirl-i-Gig

Processing 0102                                                                               2.0% 4423/254376 ETC: 04m:42s. Elapsed: 05s [|                              ]
Processing 0102000201020000000201000102028948484801000202010048020102010001000202020101530...2.0% 4424/254376 ETC: 04m:42s. Elapsed: 05s [|                              ]
Processing 111111111111111111111111111111111111111111111111111111111111111111                4.0% 10237/254376 ETC: 04m:22s. Elapsed: 11s [▩|                             ]
Processing 111111111111111111111111111111111111111111111111111111111111111111111111111111...4.0% 10238/254376 ETC: 04m:22s. Elapsed: 11s [▩|                             ]
Processing 46.00                                                                            18.0% 46469/254376 ETC: 04m:19s. Elapsed: 58s [▩▩▩▩▩|                         ]
Processing 46.0631980306938.91962412515646.0639238508798.91962412515646.0639238508798.92...18.0% 46470/254376 ETC: 04m:19s. Elapsed: 58s [▩▩▩▩▩|                         ]
Processing 46.80                                                                            18.0% 46509/254376 ETC: 04m:19s. Elapsed: 58s [▩▩▩▩▩|                         ]
Processing 46.8010724116529.812885446777146.8009989691299.815117686164746.8013478202219....18.0% 46510/254376 ETC: 04m:19s. Elapsed: 58s [▩▩▩▩▩|                         ]
Processing 47.4782725285728.3021869767226                                                   18.0% 46847/254376 ETC: 04m:21s. Elapsed: 59s [▩▩▩▩▩|                         ]
Processing 47.4813788315898.307278667577747.4816978687688.30739660558247.4816181096558.3...18.0% 46848/254376 ETC: 04m:21s. Elapsed: 59s [▩▩▩▩▩|                         ]
Processing 47.50                                                                            18.0% 46857/254376 ETC: 04m:21s. Elapsed: 59s [▩▩▩▩▩|                         ]
Processing 47.5019677412378.727194754896547.5019749892628.727194754896547.5019749892628....18.0% 46858/254376 ETC: 04m:21s. Elapsed: 59s [▩▩▩▩▩|                         ]
PHP Fatal error:  Uncaught DatabaseException: Out of range value for column 'seq' at row 49 in /var/www/providence/app/lib/Db/mysqli.php:358
Stack trace:
#0 /var/www/providence/app/lib/Db/DbStatement.php(150): Db_mysqli->execute()
#1 /var/www/providence/app/lib/Db.php(261): DbStatement->executeWithParamsAsArray()
#2 /var/www/providence/app/lib/Utils/CLIUtils/Search.php(150): Db->query()
#3 /var/www/providence/support/bin/caUtils(171): CLIUtils::create_ngrams()
#4 {main}
  thrown in /var/www/providence/app/lib/Db/mysqli.php on line 358

I have a lot of scanned and OCRed books (mostly the first pages, including the index). A large share of them are antique, so OCR often does not work well. This probably contributes a lot of mangled words to the index.
My installation is based on Providence 2.0.9.
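
To illustrate what I suspect is happening, here is a minimal Python sketch (not the actual CollectiveAccess code). It assumes that each n-gram of a word is stored with its position in `seq`, that n-grams are 4 characters long, and that `seq` is a small unsigned integer column with a maximum of 255. All three are my assumptions, and the function names are made up. Under those assumptions, a very long run-together "word" like the ones in the log above would produce positions that no longer fit in the column.

```python
# Sketch only: hypothetical helpers, not CollectiveAccess code.
# Assumptions: 4-character n-grams, and seq stored in an unsigned
# 8-bit column (maximum value 255).
ASSUMED_NGRAM_SIZE = 4
ASSUMED_SEQ_MAX = 255


def ngrams_with_seq(word, n=ASSUMED_NGRAM_SIZE):
    """Yield (seq, ngram) for every n-character window of word.

    A word shorter than n yields a single pair containing the word itself.
    """
    for seq in range(max(len(word) - n + 1, 1)):
        yield seq, word[seq:seq + n]


def overflows_seq(word, n=ASSUMED_NGRAM_SIZE, seq_max=ASSUMED_SEQ_MAX):
    """Return True if the last n-gram position of word exceeds seq_max."""
    last_seq = max(len(word) - n, 0)
    return last_seq > seq_max


if __name__ == "__main__":
    short_word = "1" * 66    # length of one of the tokens shown in the log
    long_word = "1" * 300    # hypothetical longer OCR artifact
    print(overflows_seq(short_word), overflows_seq(long_word))
```

If this is roughly right, skipping or truncating overlong words before inserting their n-grams would avoid the exception, but I have not checked how `create_ngrams()` actually builds its rows.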

Thank you!
