Citation inconsistent when indexing using Azure AI Search. #1110

@hfaouaz

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Create an indexer in Azure AI Search, along with a skillset that vectorizes the documents (PDF and/or DOC).
Make sure that the fields on the index match the document fields (content, embedding, sourcepage, sourcefile, category).
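
To make the second step concrete, here is a minimal sketch of the field check. It is not code from the sample repo: `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and it assumes the index's field names have already been retrieved as plain strings.

```python
# Hypothetical helper: these names are illustrative and not part of the sample repo.
REQUIRED_FIELDS = {"content", "embedding", "sourcepage", "sourcefile", "category"}


def missing_fields(index_field_names):
    """Return the required fields that are absent from the index, sorted by name."""
    return sorted(REQUIRED_FIELDS - set(index_field_names))


if __name__ == "__main__":
    # Assumption: this list was read from the index definition beforehand.
    print(missing_fields(["id", "content", "embedding", "sourcefile"]))
```

An empty result means every field the app reads is present on the index; extra fields such as a key field are ignored.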

Any log messages given by the failure

Expected/desired behavior

Although sourcepage is populated, it would be nice if the citation could fall back to using only sourcefile, since we are not using prepdocs.py.
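
Roughly, the behavior I have in mind could look like the sketch below. This is only an illustration of the request, not how the app currently builds citations: `citation_for` and the `prefer_sourcefile` flag are hypothetical, and each search result is assumed to be a plain dict with the field names listed above.

```python
def citation_for(doc, prefer_sourcefile=True):
    """Pick the citation label for one search result (hypothetical helper).

    With prefer_sourcefile=True the file name is used whenever it is set,
    so a sourcepage filled in by the indexer is ignored.
    """
    sourcepage = (doc.get("sourcepage") or "").strip()
    sourcefile = (doc.get("sourcefile") or "").strip()
    if prefer_sourcefile and sourcefile:
        return sourcefile
    # Otherwise use sourcepage, falling back to sourcefile if it is empty.
    return sourcepage or sourcefile


if __name__ == "__main__":
    result = {"sourcepage": "a snippet of the content", "sourcefile": "report.pdf"}
    print(citation_for(result))
```

With a flag like this, documents indexed through the built-in indexer would always be cited by file name, while documents ingested through the existing script could keep their page-level citations.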

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)
macOS

azd version?

run azd version and copy paste here.
azd version 1.5.1 (commit 3856d1e98281683b8d112e222c0a7c7b3e148e96)

Versions

1.5.1

Mention any other details that might be useful

I am trying to leverage the Azure AI Search capabilities and rely less on building my own PdfParser/FileStrategy. I got it working, but my citations are inconsistent: sometimes I get a snippet of the content and sometimes I get info1.txt, and both have broken links.
Reference
#1080


Thanks! We'll be in touch soon.
