
Userscript: optimize extractName #367

Open
QuantumNingen wants to merge 1 commit into JoeanAmier:master from QuantumNingen:master

QuantumNingen commented Apr 9, 2026

  • Optimize the userscript's extractName.
  • File changed: static/XHS-Downloader.js
  • New filename format: XHS_作者_原标题_发布日期 (XHS_author_original title_publish date)

Summary by Sourcery

Enhance the XHS user script filename generation to include author and publication date in a standardized format.

New Features:

  • Generate download filenames in the format XHS_作者_原标题_发布日期 for XHS content, including fallbacks when information is missing.

Enhancements:

  • Improve robustness of filename extraction by handling both popup and detail page author selectors, parsing various date formats, and ensuring filesystem-safe characters in the final name.


sourcery-ai bot commented Apr 9, 2026

Reviewer's Guide

Refactors the extractName function in the XHS userscript to produce richer, sanitized filenames in the format XHS_作者_原标题_发布日期, deriving author and date from the page and falling back to safe defaults when necessary.

Sequence diagram for updated extractName DOM interactions

sequenceDiagram
    actor User
    participant Userscript
    participant BrowserDOM

    User ->> Userscript: Trigger download
    Userscript ->> Userscript: extractName()

    Userscript ->> BrowserDOM: Read document.title
    BrowserDOM -->> Userscript: title

    Userscript ->> BrowserDOM: querySelector(.username)
    BrowserDOM -->> Userscript: usernameElement or null
    Userscript ->> BrowserDOM: querySelector(.author-info .name)
    BrowserDOM -->> Userscript: authorNameElement or null
    Userscript ->> Userscript: Sanitize and truncate author or use 未知

    Userscript ->> BrowserDOM: querySelector(.date)
    BrowserDOM -->> Userscript: dateElement or null
    Userscript ->> Userscript: Parse relative or absolute date
    Userscript ->> Userscript: Fallback to today if needed

    Userscript ->> Userscript: Derive baseName from title
    Userscript ->> Userscript: If empty, extract id from currentUrl or use 未命名

    Userscript ->> Userscript: Build XHS_author_baseNameOrId_date
    Userscript ->> Userscript: Remove illegal filename characters
    Userscript -->> User: Safe, enriched filename
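
The last steps of the diagram (derive baseName, fall back to an id or 未命名, build and clean the name) can be sketched as a pure function. This is a minimal sketch, not the PR's code: `buildFileName`, `extractIdFromUrl`, the `ILLEGAL_CHARS` character set, and the assumed URL shape (note id as the last path segment) are hypothetical. Only the 40-character title limit, the 未命名 placeholder, and the XHS_author_baseNameOrId_date layout come from the reviewer's guide.

```javascript
// Assumed set of characters to strip; the PR's actual regex is not shown in this thread.
const ILLEGAL_CHARS = /[\\\/:*?"<>|\r\n]/g;
const TITLE_MAX_LENGTH = 40; // limit stated in the reviewer's guide

// Hypothetical helper: treat the last path segment of the URL as the note id.
function extractIdFromUrl(url) {
  const match = /\/([0-9a-zA-Z]+)(?:[?#].*)?$/.exec(url || "");
  return match ? match[1] : null;
}

// Hypothetical helper: author and date are assumed to be resolved already.
function buildFileName({ title, author, date, url }) {
  const cleaned = (title || "").replace(ILLEGAL_CHARS, "").trim();
  // Truncate by code point so a surrogate pair (e.g. an emoji) is never split.
  const baseName = Array.from(cleaned).slice(0, TITLE_MAX_LENGTH).join("");
  const middle = baseName || extractIdFromUrl(url) || "未命名";
  // Final pass, mirroring the diagram's last cleaning step.
  return `XHS_${author}_${middle}_${date}`.replace(ILLEGAL_CHARS, "");
}
```

Keeping the function free of DOM access means it can be exercised outside the browser; the userscript would pass in `document.title` and the current URL.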

File-Level Changes

Change: Enhance filename generation logic to include author, truncated title, and publish date with robust fallbacks and sanitization.
Details:
  • Rename the original title-derived variable to baseName, keep its sanitization regex, and truncate it to 40 characters to reserve space for author and date.
  • Introduce a getAuthor helper that queries multiple selectors (.username or .author-info .name), trims and sanitizes the text, limits it to 15 characters, and falls back to '未知' when absent.
  • Introduce a getPublishDate helper that reads the .date element, normalizes relative descriptions like '天前' and '昨天', supports both YYYY-MM-DD and MM-DD formats, and falls back to today’s ISO date when parsing fails.
  • Retain the URL ID fallback when the base title is empty, replacing it with '未命名' if no ID is found, and assemble the final filename in the format XHS_作者_原标题_日期.
  • Apply a final sanitization pass on the assembled filename to strip characters invalid for file paths.
Files: static/XHS-Downloader.js
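
The two helpers described above might look roughly like the following. This is a sketch under stated assumptions rather than the PR's implementation: the regexes, the stripped character set, the `root` and `now` parameters, `parsePublishDate`, `formatDate`, and the choice to read MM-DD as a date in the current year are all hypothetical. The selectors, the 15-character limit, the '未知' default, and the supported date shapes come from the guide. One deliberate difference: the review thread quotes `new Date().toISOString().slice(0, 10)`, which yields the UTC date, while this sketch formats local calendar fields.

```javascript
const AUTHOR_SELECTORS = [".username", ".author-info .name"]; // from the guide
const AUTHOR_MAX_LENGTH = 15; // from the guide

// `root` is anything with querySelector (document in the userscript).
function getAuthor(root) {
  for (const selector of AUTHOR_SELECTORS) {
    const text = root.querySelector(selector)?.textContent ?? "";
    // Assumed character set: path-unsafe characters and whitespace.
    const cleaned = text.replace(/[\\\/:*?"<>|\s]/g, "");
    if (cleaned) return Array.from(cleaned).slice(0, AUTHOR_MAX_LENGTH).join("");
  }
  return "未知";
}

// Format a Date as YYYY-MM-DD using local calendar fields.
function formatDate(d) {
  const pad = (n) => String(n).padStart(2, "0");
  return `${d.getFullYear()}-${pad(d.getMonth() + 1)}-${pad(d.getDate())}`;
}

// Parse the text of the .date element; `now` is injectable for testing.
function parsePublishDate(text, now = new Date()) {
  const raw = (text || "").trim();
  const daysAgo = /(\d+)\s*天前/.exec(raw);
  const offset = daysAgo ? Number(daysAgo[1]) : raw.includes("昨天") ? 1 : null;
  if (offset !== null) {
    // Date arithmetic on the day field rolls over month boundaries.
    return formatDate(new Date(now.getFullYear(), now.getMonth(), now.getDate() - offset));
  }
  const full = /(\d{4})-(\d{1,2})-(\d{1,2})/.exec(raw);
  if (full) return `${full[1]}-${full[2].padStart(2, "0")}-${full[3].padStart(2, "0")}`;
  const short = /(\d{1,2})-(\d{1,2})/.exec(raw);
  // Assumption: a bare MM-DD refers to the current year.
  if (short) return `${now.getFullYear()}-${short[1].padStart(2, "0")}-${short[2].padStart(2, "0")}`;
  return formatDate(now); // fallback described in the guide: today
}
```

In the userscript these would be called as `getAuthor(document)` and `parsePublishDate(document.querySelector(".date")?.textContent)`.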

sourcery-ai bot left a comment


Hey - I've left some high level feedback:

  • The date parsing logic in getPublishDate currently hardcodes several Chinese-relative formats and falls back to today’s date in many cases; consider centralizing the new Date().toISOString().slice(0, 10) fallback and making the parsing more explicit so unexpected formats don’t silently map to today.
  • The author extraction truncates to 15 characters and replaces a specific set of characters; it might be clearer to extract the sanitization and truncation into a small helper with named constants so future adjustments to allowed length or characters are easier to maintain.
  • You now sanitize the filename both when building parts (author/baseName) and again at the end; consider ensuring the sanitization is applied consistently in a single place to avoid slightly different rules or redundant work.
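
One way to act on the second and third points is a single sanitizer with named constants, applied once per part so that no second pass over the assembled name is needed. The sketch below is illustrative only: `sanitizePart`, `assembleName`, the constant names, and the exact character set are assumptions, not code from the PR. The limits (15 and 40) and the fallbacks ('未知', '未命名') are the ones mentioned in the reviewer's guide.

```javascript
// Assumed character set; adjust in one place if the rules change.
const ILLEGAL_FILENAME_CHARS = /[\\\/:*?"<>|\r\n]/g;
const MAX_AUTHOR_LENGTH = 15;
const MAX_TITLE_LENGTH = 40;

// Clean, trim, and truncate one filename part; return the fallback if nothing is left.
function sanitizePart(text, maxLength, fallback) {
  const cleaned = String(text ?? "").replace(ILLEGAL_FILENAME_CHARS, "").trim();
  // Array.from splits by code point, so truncation never cuts a surrogate pair.
  return Array.from(cleaned).slice(0, maxLength).join("") || fallback;
}

// Every part passes through the same rules exactly once.
// `date` is assumed to be a YYYY-MM-DD string produced by the date helper.
function assembleName(author, title, date) {
  return [
    "XHS",
    sanitizePart(author, MAX_AUTHOR_LENGTH, "未知"),
    sanitizePart(title, MAX_TITLE_LENGTH, "未命名"),
    date,
  ].join("_");
}
```

Because each part is already clean when it is joined, the separator and prefix are the only other characters in the result, and both are safe by construction.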