[#9622] improvement(docs): Add guide for Lance REST integration with Spark and Ray #9623
Conversation
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
This PR will be blocked by several bugs and improvements; they are:
The first two have been resolved.
All resolved and ready to review.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 10 comments.
The `LOCATION` clause in the `CREATE TABLE` statement is optional. When omitted, lance-spark automatically determines an appropriate storage location based on catalog properties.
For detailed information on location resolution logic, refer to the [Lakehouse Generic Catalog documentation](./lakehouse-generic-catalog.md#key-property-location).
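To illustrate the two forms, here is a minimal sketch assuming a Spark session whose catalog (called `lance` here purely for illustration) is already configured for lance-spark; the table and path names are placeholders, not values from the guide under review.

```python
# Sketch only: catalog, table, and path names are placeholders, not values
# taken from the documentation being reviewed.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Explicit LOCATION: the table's storage path is given directly.
spark.sql("""
    CREATE TABLE lance.db.events (id BIGINT, name STRING)
    LOCATION 's3://my-bucket/warehouse/db/events'
""")

# No LOCATION: lance-spark derives the storage path from catalog properties.
spark.sql("""
    CREATE TABLE lance.db.metrics (id BIGINT, value DOUBLE)
""")
```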
Why does this chapter belong to local storage? Also, as I recall, an external table must specify the location, am I right?
> Also, as I recall, an external table must specify the location, am I right?
I'm afraid not. According to the codebase, whether the table is external or not, the calculation method is the same: if no location is specified at the table level, it falls back to the location of its parent, such as the schema or the catalog. It seems that we only use the `external` property when dropping or purging the table. (A sketch of this fallback appears after the code excerpt below.)
Lines 224 to 237 in c206c07
```java
public Table createTable(
    NameIdentifier ident,
    Column[] columns,
    String comment,
    Map<String, String> properties,
    Transform[] partitions,
    Distribution distribution,
    SortOrder[] sortOrders,
    Index[] indexes)
    throws NoSuchSchemaException, TableAlreadyExistsException {
  Schema schema = loadSchema(NameIdentifier.of(ident.namespace().levels()));
  String tableLocation = calculateTableLocation(schema, ident, properties);
  String format = properties.getOrDefault(Table.PROPERTY_TABLE_FORMAT, null);
  // ... (excerpt truncated)
```
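To make the described fallback concrete, here is an illustrative sketch only, not the actual Gravitino implementation; the function, property keys, and the way the table name is appended to the parent location are all assumptions.

```python
# Illustrative sketch of the fallback order described above: table location
# first, then the parent schema's location, then the catalog's location.
# All names and property keys here are hypothetical.
def calculate_table_location(catalog_props, schema_props, table_props, table_name):
    # 1. An explicit location on the table wins.
    if "location" in table_props:
        return table_props["location"]
    # 2. Otherwise fall back to the parent schema's location
    #    (appending the table name is an assumption for this sketch).
    if "location" in schema_props:
        return f'{schema_props["location"].rstrip("/")}/{table_name}'
    # 3. Otherwise fall back to the catalog-level location.
    if "location" in catalog_props:
        return f'{catalog_props["location"].rstrip("/")}/{table_name}'
    raise ValueError("No location specified at table, schema, or catalog level")
```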
> an external table must specify the location
Do you mean users need to specify the location explicitly on the table, or can they set it on any of the catalog, schema, or table?
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
What changes were proposed in this pull request?
This pull request adds a new documentation page that explains how to integrate the Lance REST service with Spark and Ray using their respective connectors. The guide covers prerequisites, compatibility, example usage for both Spark and Ray, troubleshooting tips, and a compatibility matrix for versions.
Why are the changes needed?
For a better user experience
Fix: #9622
Does this PR introduce any user-facing change?
N/A.
How was this patch tested?
N/A.
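Since the guide's examples are not reproduced in this thread, here is a minimal sketch of the kind of usage it covers, assuming a Spark session whose Lance catalog is already configured as the guide describes and a Lance dataset reachable from Ray; the catalog name `lance`, the table names, and the paths are placeholders rather than the guide's actual values.

```python
# Sketch only: catalog name, table names, and the dataset path are placeholders;
# see the guide added in this PR for the real lance-spark and Ray configuration.
from pyspark.sql import SparkSession
import ray

# Spark: query a Lance table through a catalog already configured for lance-spark.
spark = SparkSession.builder.appName("lance-rest-demo").getOrCreate()
spark.sql("SELECT * FROM lance.db.events LIMIT 10").show()

# Ray: read a Lance dataset as a Ray Dataset via its storage location.
ds = ray.data.read_lance("s3://my-bucket/warehouse/db/events.lance")
print(ds.schema())
ds.show(limit=10)
```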