Conflicts: h2o-core/src/main/java/water/fvec/CStrChunk.java h2o-core/src/main/java/water/fvec/NewChunk.java
|
The only catch with putting this in the main h2o-dev stream is lines 666-670 of ParsedData2.java. If the Enum maps max out, this code spills over to using strings instead of inserting NAs, which was the old behavior. Since string/enum unification between chunks isn't included in this patch, anything that tests that behavior will fail. If that isn't an issue, then merge away. Commenting out the above-mentioned lines is handy for testing: all items are then read in as strings, and basic string vector functionality can be tested. Putting them back in lets you see the standard enum behavior again (minus overflow NAs). I hope to finish the string/enum unification tonight. |
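To make the spill-over behavior described above concrete, here is a minimal sketch. It is not the code in ParsedData2.java; the class name `EnumOrStringColumn`, the `maxDomainSize` cap, and the list-based storage are all assumptions made for illustration. The column stores enum codes until the number of distinct levels would exceed the cap, then converts the rows seen so far back to strings and keeps reading strings from there on, rather than inserting NAs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not H2O's parser code: a column that holds enum
// (categorical) codes until the domain would exceed a cap, then switches
// to plain strings instead of emitting NAs for the overflow values.
class EnumOrStringColumn {
    private final int maxDomainSize;                        // assumed cap on distinct levels
    private final Map<String, Integer> domain = new HashMap<>();
    private final List<String> levels = new ArrayList<>();  // code -> level name
    private List<Integer> codes = new ArrayList<>();        // used while in enum mode
    private List<String> strings = null;                    // used after the spill

    EnumOrStringColumn(int maxDomainSize) { this.maxDomainSize = maxDomainSize; }

    boolean isStringColumn() { return strings != null; }

    void add(String value) {
        if (strings != null) { strings.add(value); return; }
        Integer code = domain.get(value);
        if (code == null) {
            if (domain.size() >= maxDomainSize) {
                // Enum map is full: convert what we have and continue as strings.
                spillToStrings();
                strings.add(value);
                return;
            }
            code = domain.size();
            domain.put(value, code);
            levels.add(value);
        }
        codes.add(code);
    }

    // Rewrites the rows collected so far from codes back to their level names.
    private void spillToStrings() {
        strings = new ArrayList<>(codes.size() + 1);
        for (int c : codes) strings.add(levels.get(c));
        codes = null;
        domain.clear();
    }

    String get(int row) {
        return strings != null ? strings.get(row) : levels.get(codes.get(row));
    }

    int size() { return strings != null ? strings.size() : codes.size(); }
}
```

With a cap of 2, adding "a", "b", "a" keeps the column in enum mode; adding "c" flips it to a string column with all four rows intact. The sketch covers a single column only; reconciling a string chunk with enum chunks of the same vector is the unification step the comment says is not in this patch.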
|
If you're that close, I might wait on Anand's pull & take 'em all. fyi, we've been hacking the "gradle test" and "make check" paths - both Cliff On 7/7/2014 5:26 PM, Brandon wrote:
|
|
Tomas - I'm poking at work between setting up for this trip. We can handle the H2O1 integration later (if ever; let's work out the Thanks On 7/7/2014 5:26 PM, Brandon wrote:
|
|
OK, I merged it in, but I have two comments: a) the chunk is not backed directly by the serialized bytes as other chunks are (it uses two separate arrays internally, and both are allocated/copied during read()), so it will double the required memory. b) Do we want the interface to return String (which requires copying the memory for each accessed row), or should we use ValueString, as the parser does, to avoid the memcopy/allocation? |
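For point b), the trade-off can be sketched as follows. This is an illustration only: `ByteSlice` is a hypothetical stand-in for a ValueString-style view, not H2O's ValueString class, and `StringRows` is an invented container. It happens to keep two arrays purely for brevity, so it says nothing about point a).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a ValueString-style view: it points into a
// shared byte buffer instead of owning a copy of the characters.
class ByteSlice {
    byte[] buf;
    int off;
    int len;

    ByteSlice set(byte[] buf, int off, int len) {
        this.buf = buf; this.off = off; this.len = len;
        return this;
    }

    // Compares contents in place, without creating any String.
    boolean contentEquals(ByteSlice other) {
        if (len != other.len) return false;
        for (int i = 0; i < len; i++)
            if (buf[off + i] != other.buf[other.off + i]) return false;
        return true;
    }

    // Materializes a String only on request; this is where the copy happens.
    @Override public String toString() {
        return new String(buf, off, len, StandardCharsets.UTF_8);
    }
}

// Invented container holding all rows as concatenated UTF-8 bytes.
class StringRows {
    private final byte[] data;   // all rows, back to back
    private final int[] starts;  // row i occupies data[starts[i] .. starts[i+1])

    StringRows(String... rows) {
        starts = new int[rows.length + 1];
        byte[][] enc = new byte[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            enc[i] = rows[i].getBytes(StandardCharsets.UTF_8);
            starts[i + 1] = starts[i] + enc[i].length;
        }
        data = new byte[starts[rows.length]];
        for (int i = 0; i < rows.length; i++)
            System.arraycopy(enc[i], 0, data, starts[i], enc[i].length);
    }

    // String-returning accessor: allocates and copies on every call.
    String getString(int row) {
        return new String(data, starts[row], starts[row + 1] - starts[row],
                          StandardCharsets.UTF_8);
    }

    // View-returning accessor: fills a caller-supplied slice, no per-row copy.
    ByteSlice getSlice(int row, ByteSlice reuse) {
        return reuse.set(data, starts[row], starts[row + 1] - starts[row]);
    }
}
```

A caller scanning many rows can reuse one `ByteSlice` across the loop and compare or hash in place, paying for a `String` only on the rows where it actually needs one.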
|
The interface was driven by the first requirement (Mahout), in which case On Tue, Jul 8, 2014 at 3:53 PM, tomasnykodym notifications@github.com
|
|
Good catch. Yes, that's a fail. So yes, please, flip CStrChunk to a pile-o-bytes, and have the interface Brandon, are you up to doing this? Can you coordinate w/Tomas & Anand Thanks, On 7/8/2014 3:53 PM, tomasnykodym wrote:
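One way to picture the "pile-o-bytes" idea is a chunk whose only state is a single byte array that doubles as its serialized form. The layout below (row count, then one end offset per row, then the UTF-8 payload) and the name `PackedStrings` are assumptions for illustration, not the format CStrChunk ended up with.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical layout, not the actual CStrChunk format:
//   [int rowCount][int endOffset per row][UTF-8 bytes of all rows]
// Ints are big-endian, matching ByteBuffer's default byte order.
class PackedStrings {
    private final byte[] mem;  // the only state; also the serialized form

    // "Reading" a chunk keeps the reference; no side arrays are rebuilt.
    PackedStrings(byte[] mem) { this.mem = mem; }

    static PackedStrings of(String... rows) {
        byte[][] enc = new byte[rows.length][];
        int total = 0;
        for (int i = 0; i < rows.length; i++) {
            enc[i] = rows[i].getBytes(StandardCharsets.UTF_8);
            total += enc[i].length;
        }
        ByteBuffer bb = ByteBuffer.allocate(4 + 4 * rows.length + total);
        bb.putInt(rows.length);
        int end = 0;
        for (byte[] e : enc) { end += e.length; bb.putInt(end); }
        for (byte[] e : enc) bb.put(e);
        return new PackedStrings(bb.array());
    }

    byte[] bytes() { return mem; }

    int rowCount() { return readInt(0); }

    // Decodes a big-endian int straight out of the backing array.
    private int readInt(int pos) {
        return ((mem[pos] & 0xFF) << 24) | ((mem[pos + 1] & 0xFF) << 16)
             | ((mem[pos + 2] & 0xFF) << 8) | (mem[pos + 3] & 0xFF);
    }

    // Offset of a row's first byte, relative to the start of the payload.
    private int relStart(int row) { return row == 0 ? 0 : readInt(4 + 4 * (row - 1)); }

    int rowStart(int row)  { return 4 + 4 * rowCount() + relStart(row); }
    int rowLength(int row) { return readInt(4 + 4 * row) - relStart(row); }

    String get(int row) {
        return new String(mem, rowStart(row), rowLength(row), StandardCharsets.UTF_8);
    }
}
```

Because offsets live inside the same array as the payload, handing `bytes()` to another `PackedStrings` yields a working chunk with no extra allocation. `rowStart` and `rowLength` are also exactly what a view-style accessor would need to point into `mem` without copying.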
|
|
Yes, But No. Internally we'll use ValueString. We'll look at a double-storage option (both strings & bytes) if String Cliff On 7/8/2014 4:07 PM, Anand Avati wrote:
|
# This is the 1st commit message: PUBDEV-3793-python-api-misc. Added unit test for h2o.init().
# This is the commit message #2: PUBDEV-3793-python-api-h2o-cluster-commands. Added pyunit test for h2o.ls().
# This is the commit message #3: PUBDEV-3797: H2O cluster apis. Added pyunit skeleton test files.
# This is the commit message #4: PUBDEV-3797: api from H2O cluster. Completed h2o.init() tests.
# This is the commit message #5: PUBDEV-3797: H2O MODULE API tests. pyunit tests API complete.
# This is the commit message #6: PUBDEV-3797: H2O module api tests. Added more tests.
# This is the commit message #7: PUBDEV-3797: API test for python client in H2O Module. Added all tests. Ready for review.
# This is the commit message #8: PUBDEV-3797: Python API H2O Module. Incorporated Pasha comments and minor code cleanup.
# This is the commit message #9: PUBDEV-3797: Python API tests for H2O Module. Fixed pyunit tests to use more general commands and hopefully pass all the tests.
# This is the commit message #10: PUBDEV-3797: api for H2O Module. Fixed h2oinit.py failure. Thanks Pasha.
# This is the commit message #1: PUBDEV-3797: api test for H2O Module. Clean up h2oinit.py test.
# This is the commit message #2: PUBDEV-3797: test H2O Module API. Added more print for pyunit_h2oinit.py.
String-based chunk support. Set contains two patches