Commit 4ee5673

Merge branch 'rockset-master'
2 parents 337ef7c + 9a6461f commit 4ee5673

984 files changed

+17694
-10547
lines changed

.gitignore

Lines changed: 6 additions & 0 deletions
@@ -52,6 +52,8 @@ trace_analyzer
 trace_analyzer_test
 block_cache_trace_analyzer
 .DS_Store
+.vs
+.vscode
 
 java/out
 java/target
@@ -81,3 +83,7 @@ fbcode
 travis-build/
 buckifier/*.pyc
 buckifier/__pycache__
+
+compile_commands.json
+.vscode
+.clangd

CMakeLists.txt

Lines changed: 8 additions & 5 deletions
@@ -665,6 +665,7 @@ set(SOURCES
 util/random.cc
 util/rate_limiter.cc
 util/slice.cc
+util/file_checksum_helper.cc
 util/status.cc
 util/string_util.cc
 util/thread_local.cc
@@ -906,15 +907,15 @@ if(NOT WIN32 OR ROCKSDB_INSTALL_ON_WINDOWS)
 )
 endif()
 
-add_subdirectory(third-party/gtest-1.8.1/fused-src/gtest)
-add_library(testharness STATIC
-test_util/testharness.cc)
-target_link_libraries(testharness gtest)
-
 # Tests are excluded from Release builds
 CMAKE_DEPENDENT_OPTION(WITH_TESTS "build with tests" ON
 "CMAKE_BUILD_TYPE STREQUAL Debug" OFF)
 if(WITH_TESTS)
+add_subdirectory(third-party/gtest-1.8.1/fused-src/gtest)
+add_library(testharness STATIC
+test_util/testharness.cc)
+target_link_libraries(testharness gtest)
+
 set(TESTS
 cache/cache_test.cc
 cache/lru_cache_test.cc
@@ -1025,6 +1026,7 @@ if(WITH_TESTS)
 util/bloom_test.cc
 util/coding_test.cc
 util/crc32c_test.cc
+util/defer_test.cc
 util/dynamic_bloom_test.cc
 util/file_reader_writer_test.cc
 util/filelock_test.cc
@@ -1033,6 +1035,7 @@ if(WITH_TESTS)
 util/random_test.cc
 util/rate_limiter_test.cc
 util/repeatable_thread_test.cc
+util/slice_test.cc
 util/slice_transform_test.cc
 util/timer_queue_test.cc
 util/thread_list_test.cc

HISTORY.md

Lines changed: 37 additions & 0 deletions
@@ -1,4 +1,39 @@
 # Rocksdb Change Log
+## Unreleased
+### Bug Fixes
+* Upgraded version of bzip library (1.0.6 -> 1.0.8) used with RocksJava to address potential vulnerabilities if an attacker can manipulate compressed data saved and loaded by RocksDB (not normal). See issue #6703.
+
+## 6.8.1 (03/30/2020)
+### Behavior changes
+* Since RocksDB 6.8.0, ttl-based FIFO compaction can drop a file whose oldest key becomes older than options.ttl while others have not. This fix reverts this and makes ttl-based FIFO compaction use the file's flush time as the criterion. This fix also requires that max_open_files = -1 and compaction_options_fifo.allow_compaction = false to function properly.
+
+## 6.8.0 (02/24/2020)
+### Java API Changes
+* Major breaking changes to Java comparators, toward standardizing on ByteBuffer for performant, locale-neutral operations on keys (#6252).
+* Added overloads of common API methods using direct ByteBuffers for keys and values (#2283).
+
+### Bug Fixes
+* Fix incorrect results while block-based table uses kHashSearch, together with Prev()/SeekForPrev().
+* Fix a bug that prevents opening a DB after two consecutive crashes with TransactionDB, where the first crash recovers from a corrupted WAL with kPointInTimeRecovery but the second cannot.
+* Fixed issue #6316 that can cause a corruption of the MANIFEST file in the middle when writing to it fails due to no disk space.
+* Add DBOptions::skip_checking_sst_file_sizes_on_db_open. It disables potentially expensive checking of all sst file sizes in DB::Open().
+* BlobDB now ignores trivially moved files when updating the mapping between blob files and SSTs. This should mitigate issue #6338 where out of order flush/compaction notifications could trigger an assertion with the earlier code.
+* Batched MultiGet() ignores IO errors while reading data blocks, causing it to potentially continue looking for a key and return stale results.
+* `WriteBatchWithIndex::DeleteRange` returns `Status::NotSupported`. Previously it returned success even though reads on the batch did not account for range tombstones. The corresponding language bindings now cannot be used. In C, that includes `rocksdb_writebatch_wi_delete_range`, `rocksdb_writebatch_wi_delete_range_cf`, `rocksdb_writebatch_wi_delete_rangev`, and `rocksdb_writebatch_wi_delete_rangev_cf`. In Java, that includes `WriteBatchWithIndex::deleteRange`.
+* Assign new MANIFEST file number when caller tries to create a new MANIFEST by calling LogAndApply(..., new_descriptor_log=true). This bug can cause MANIFEST being overwritten during recovery if options.write_dbid_to_manifest = true and there are WAL file(s).
+
+### Performance Improvements
+* Perform readahead when reading from option files. Inside DB, options.log_readahead_size will be used as the readahead size. In other cases, a default 512KB is used.
+
+### Public API Change
+* The BlobDB garbage collector now emits the statistics `BLOB_DB_GC_NUM_FILES` (number of blob files obsoleted during GC), `BLOB_DB_GC_NUM_NEW_FILES` (number of new blob files generated during GC), `BLOB_DB_GC_FAILURES` (number of failed GC passes), `BLOB_DB_GC_NUM_KEYS_RELOCATED` (number of blobs relocated during GC), and `BLOB_DB_GC_BYTES_RELOCATED` (total size of blobs relocated during GC). On the other hand, the following statistics, which are not relevant for the new GC implementation, are now deprecated: `BLOB_DB_GC_NUM_KEYS_OVERWRITTEN`, `BLOB_DB_GC_NUM_KEYS_EXPIRED`, `BLOB_DB_GC_BYTES_OVERWRITTEN`, `BLOB_DB_GC_BYTES_EXPIRED`, and `BLOB_DB_GC_MICROS`.
+* Disable recycle_log_file_num when inconsistent recovery modes are requested: kPointInTimeRecovery and kAbsoluteConsistency
+
+### New Features
+* Added the checksum for each SST file generated by Flush or Compaction. Added sst_file_checksum_func to Options such that users can plug in their own SST file checksum function by overriding the FileChecksumFunc class. If the user does not set sst_file_checksum_func, SST file checksum calculation will not be enabled. The checksum information includes a uint32_t checksum value and a checksum function name (string). The checksum information is stored in FileMetadata in version store and also logged to MANIFEST. A new tool is added to LDB such that users can dump out a list of file checksum information from MANIFEST (stored in an unordered_map).
+* `db_bench` now supports `value_size_distribution_type`, `value_size_min`, `value_size_max` options for generating random variable-sized values. Added `blob_db_compression_type` option for BlobDB to enable blob compression.
+* Replace RocksDB namespace "rocksdb" with flag "ROCKSDB_NAMESPACE", which, if not defined, is defined as "rocksdb" in header file rocksdb_namespace.h.
+
 ## 6.7.3 (03/18/2020)
 ### Bug Fixes
 * Fix a data race that might cause crash when calling DB::GetCreationTimeOfOldestFile() by a small chance. The bug was introduced in 6.6 Release.
@@ -29,11 +64,13 @@
 * Fixed an issue where the thread pools were not resized upon setting `max_background_jobs` dynamically through the `SetDBOptions` interface.
 * Fix a bug that can cause write threads to hang when a slowdown/stall happens and there is a mix of writers with WriteOptions::no_slowdown set/unset.
 * Fixed an issue where an incorrect "number of input records" value was used to compute the "records dropped" statistics for compactions.
+* Fix a regression bug that causes segfault when hash is used, max_open_files != -1 and total order seek is used and switched back.
 
 ### New Features
 * It is now possible to enable periodic compactions for the base DB when using BlobDB.
 * BlobDB now garbage collects non-TTL blobs when `enable_garbage_collection` is set to `true` in `BlobDBOptions`. Garbage collection is performed during compaction: any valid blobs located in the oldest N files (where N is the number of non-TTL blob files multiplied by the value of `BlobDBOptions::garbage_collection_cutoff`) encountered during compaction get relocated to new blob files, and old blob files are dropped once they are no longer needed. Note: we recommend enabling periodic compactions for the base DB when using this feature to deal with the case when some old blob files are kept alive by SSTs that otherwise do not get picked for compaction.
 * `db_bench` now supports the `garbage_collection_cutoff` option for BlobDB.
+* Introduce ReadOptions.auto_prefix_mode. When set to true, iterator will return the same result as total order seek, but may choose to use prefix seek internally based on seek key and iterator upper bound.
 * MultiGet() can use IO Uring to parallelize read from the same SST file. This feature is by default disabled. It can be enabled with environment variable ROCKSDB_USE_IO_URING.
 
 ## 6.6.2 (01/13/2020)
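
The 6.8.0 "New Features" entry above describes a pluggable SST file checksum: a function object that yields a `uint32_t` value plus a function name, with per-file results kept in an `unordered_map`. The following is a minimal standalone sketch of that idea only. It does not use or reproduce the RocksDB `FileChecksumFunc` interface; `ChecksumFuncSketch`, `Fnv1aChecksum`, `ChecksumMap`, and `RecordChecksum` are hypothetical names, and FNV-1a is an arbitrary stand-in hash chosen for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical interface, loosely following the changelog's description.
// This is NOT the RocksDB FileChecksumFunc class; signatures are assumed.
class ChecksumFuncSketch {
 public:
  virtual ~ChecksumFuncSketch() = default;
  // Returns a 32-bit checksum over the n bytes starting at data.
  virtual uint32_t Value(const char* data, size_t n) const = 0;
  // Returns the name that is recorded next to every checksum value.
  virtual const char* Name() const = 0;
};

// Example implementation using 32-bit FNV-1a (an assumption for illustration).
class Fnv1aChecksum : public ChecksumFuncSketch {
 public:
  uint32_t Value(const char* data, size_t n) const override {
    uint32_t h = 2166136261u;  // FNV-1a 32-bit offset basis
    for (size_t i = 0; i < n; ++i) {
      h ^= static_cast<unsigned char>(data[i]);
      h *= 16777619u;  // FNV-1a 32-bit prime
    }
    return h;
  }
  const char* Name() const override { return "Fnv1aChecksum"; }
};

// Per-file record: checksum value plus the name of the producing function.
using ChecksumInfo = std::pair<uint32_t, std::string>;

// Hypothetical stand-in for a file-number -> checksum listing.
using ChecksumMap = std::unordered_map<uint64_t, ChecksumInfo>;

// Computes the checksum of `contents` and stores it under `file_number`.
void RecordChecksum(ChecksumMap* map, uint64_t file_number,
                    const ChecksumFuncSketch& func,
                    const std::string& contents) {
  (*map)[file_number] =
      ChecksumInfo(func.Value(contents.data(), contents.size()), func.Name());
}
```

Storing the function name next to the value lets a reader of the listing tell which function produced each checksum, which matters once more than one implementation can be plugged in.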

Makefile

Lines changed: 11 additions & 3 deletions
@@ -506,6 +506,7 @@ TESTS = \
 data_block_hash_index_test \
 cache_test \
 corruption_test \
+slice_test \
 slice_transform_test \
 dbformat_test \
 fault_injection_test \
@@ -598,6 +599,7 @@ TESTS = \
 db_secondary_test \
 block_cache_tracer_test \
 block_cache_trace_analyzer_test \
+defer_test \
 
 ifeq ($(USE_FOLLY_DISTRIBUTED_MUTEX),1)
 TESTS += folly_synchronization_distributed_mutex_test
@@ -1294,6 +1296,9 @@ corruption_test: db/corruption_test.o db/db_test_util.o $(LIBOBJECTS) $(TESTHARN
 crc32c_test: util/crc32c_test.o $(LIBOBJECTS) $(TESTHARNESS)
 $(AM_LINK)
 
+slice_test: util/slice_test.o $(LIBOBJECTS) $(TESTHARNESS)
+$(AM_LINK)
+
 slice_transform_test: util/slice_transform_test.o $(LIBOBJECTS) $(TESTHARNESS)
 $(AM_LINK)
 
@@ -1722,6 +1727,9 @@ block_cache_tracer_test: trace_replay/block_cache_tracer_test.o trace_replay/blo
 block_cache_trace_analyzer_test: tools/block_cache_analyzer/block_cache_trace_analyzer_test.o tools/block_cache_analyzer/block_cache_trace_analyzer.o $(LIBOBJECTS) $(TESTHARNESS)
 $(AM_LINK)
 
+defer_test: util/defer_test.o $(LIBOBJECTS) $(TESTHARNESS)
+$(AM_LINK)
+
 #-------------------------------------------------
 # make install related stuff
 INSTALL_PATH ?= /usr/local
@@ -1801,9 +1809,9 @@ SHA256_CMD = sha256sum
 ZLIB_VER ?= 1.2.11
 ZLIB_SHA256 ?= c3e5e9fdd5004dcb542feda5ee4f0ff0744628baf8ed2dd5d66f8ca1197cb1a1
 ZLIB_DOWNLOAD_BASE ?= http://zlib.net
-BZIP2_VER ?= 1.0.6
-BZIP2_SHA256 ?= a2848f34fcd5d6cf47def00461fcb528a0484d8edef8208d6d2e2909dc61d9cd
-BZIP2_DOWNLOAD_BASE ?= https://downloads.sourceforge.net/project/bzip2
+BZIP2_VER ?= 1.0.8
+BZIP2_SHA256 ?= ab5a03176ee106d3f0fa90e381da478ddae405918153cca248e682cd0c4a2269
+BZIP2_DOWNLOAD_BASE ?= https://sourceware.org/pub/bzip2
 SNAPPY_VER ?= 1.1.8
 SNAPPY_SHA256 ?= 16b677f07832a612b0836178db7f374e414f94657c138e6993cbfc5dcc58651f
 SNAPPY_DOWNLOAD_BASE ?= https://github.com/google/snappy/archive

TARGETS

Lines changed: 16 additions & 3 deletions
@@ -26,14 +26,13 @@ ROCKSDB_EXTERNAL_DEPS = [
 ("lz4", None, "lz4"),
 ("zstd", None),
 ("tbb", None),
-("liburing", None, "uring"),
 ("googletest", None, "gtest"),
 ]
 
 ROCKSDB_OS_DEPS = [
 (
 "linux",
-["third-party//numa:numa"],
+["third-party//numa:numa", "third-party//liburing:uring"],
 ),
 ]
 
@@ -73,7 +72,6 @@ ROCKSDB_PREPROCESSOR_FLAGS = [
 "-DZSTD_STATIC_LINKING_ONLY",
 "-DGFLAGS=gflags",
 "-DTBB",
-"-DLIBURING",
 
 # Added missing flags from output of build_detect_platform
 "-DROCKSDB_BACKTRACE",
@@ -285,6 +283,7 @@ cpp_library(
 "util/concurrent_task_limiter_impl.cc",
 "util/crc32c.cc",
 "util/dynamic_bloom.cc",
+"util/file_checksum_helper.cc",
 "util/hash.cc",
 "util/murmurhash.cc",
 "util/random.cc",
@@ -905,6 +904,13 @@ ROCKS_TESTS = [
 [],
 [],
 ],
+[
+"defer_test",
+"util/defer_test.cc",
+"serial",
+[],
+[],
+],
 [
 "delete_scheduler_test",
 "file/delete_scheduler_test.cc",
@@ -1311,6 +1317,13 @@ ROCKS_TESTS = [
 [],
 [],
 ],
+[
+"slice_test",
+"util/slice_test.cc",
+"serial",
+[],
+[],
+],
 [
 "slice_transform_test",
 "util/slice_transform_test.cc",

buckifier/targets_cfg.py

Lines changed: 1 addition & 3 deletions
@@ -32,14 +32,13 @@
 ("lz4", None, "lz4"),
 ("zstd", None),
 ("tbb", None),
-("liburing", None, "uring"),
 ("googletest", None, "gtest"),
 ]
 
 ROCKSDB_OS_DEPS = [
 (
 "linux",
-["third-party//numa:numa"],
+["third-party//numa:numa", "third-party//liburing:uring"],
 ),
 ]
 
@@ -79,7 +78,6 @@
 "-DZSTD_STATIC_LINKING_ONLY",
 "-DGFLAGS=gflags",
 "-DTBB",
-"-DLIBURING",
 
 # Added missing flags from output of build_detect_platform
 "-DROCKSDB_BACKTRACE",

build_tools/rocksdb-lego-determinator

Lines changed: 6 additions & 1 deletion
@@ -414,7 +414,8 @@ STRESS_CRASH_TEST_COMMANDS="[
 'shell':'$SHM $DEBUG $NON_TSAN_CRASH make J=1 crash_test || $CONTRUN_NAME=crash_test $TASK_CREATION_TOOL',
 'user':'root',
 $PARSER
-}
+},
+$UPLOAD_DB_DIR,
 ],
 $REPORT
 }
@@ -542,6 +543,7 @@ ASAN_CRASH_TEST_COMMANDS="[
 'user':'root',
 $PARSER
 },
+$UPLOAD_DB_DIR,
 ],
 $REPORT
 }
@@ -634,6 +636,7 @@ UBSAN_CRASH_TEST_COMMANDS="[
 'user':'root',
 $PARSER
 },
+$UPLOAD_DB_DIR,
 ],
 $REPORT
 }
@@ -751,6 +754,7 @@ TSAN_CRASH_TEST_COMMANDS="[
 'user':'root',
 $PARSER
 },
+$UPLOAD_DB_DIR,
 ],
 $REPORT
 }
@@ -1059,5 +1063,6 @@ case $1 in
 ;;
 *)
 echo "Invalid determinator command"
+exit 1
 ;;
 esac

cache/cache_bench.cc

Lines changed: 4 additions & 4 deletions
@@ -45,7 +45,7 @@ DEFINE_int32(erase_percent, 10,
 
 DEFINE_bool(use_clock_cache, false, "");
 
-namespace rocksdb {
+namespace ROCKSDB_NAMESPACE {
 
 class CacheBench;
 namespace {
@@ -154,7 +154,7 @@ class CacheBench {
 }
 
 bool Run() {
-rocksdb::Env* env = rocksdb::Env::Default();
+ROCKSDB_NAMESPACE::Env* env = ROCKSDB_NAMESPACE::Env::Default();
 
 PrintEnv();
 SharedState shared(this);
@@ -257,7 +257,7 @@ class CacheBench {
 printf("----------------------------\n");
 }
 };
-} // namespace rocksdb
+} // namespace ROCKSDB_NAMESPACE
 
 int main(int argc, char** argv) {
 ParseCommandLineFlags(&argc, &argv, true);
@@ -267,7 +267,7 @@ int main(int argc, char** argv) {
 exit(1);
 }
 
-rocksdb::CacheBench bench;
+ROCKSDB_NAMESPACE::CacheBench bench;
 if (FLAGS_populate_cache) {
 bench.PopulateCache();
 }

cache/cache_test.cc

Lines changed: 2 additions & 2 deletions
@@ -20,7 +20,7 @@
 #include "util/coding.h"
 #include "util/string_util.h"
 
-namespace rocksdb {
+namespace ROCKSDB_NAMESPACE {
 
 // Conversions between numeric keys/values and the types expected by Cache.
 static std::string EncodeKey(int k) {
@@ -765,7 +765,7 @@ INSTANTIATE_TEST_CASE_P(CacheTestInstance, CacheTest, testing::Values(kLRU));
 #endif // SUPPORT_CLOCK_CACHE
 INSTANTIATE_TEST_CASE_P(CacheTestInstance, LRUCacheTest, testing::Values(kLRU));
 
-} // namespace rocksdb
+} // namespace ROCKSDB_NAMESPACE
 
 int main(int argc, char** argv) {
 ::testing::InitGoogleTest(&argc, argv);

cache/clock_cache.cc

Lines changed: 4 additions & 4 deletions
@@ -11,7 +11,7 @@
 
 #ifndef SUPPORT_CLOCK_CACHE
 
-namespace rocksdb {
+namespace ROCKSDB_NAMESPACE {
 
 std::shared_ptr<Cache> NewClockCache(
 size_t /*capacity*/, int /*num_shard_bits*/, bool /*strict_capacity_limit*/,
@@ -20,7 +20,7 @@ std::shared_ptr<Cache> NewClockCache(
 return nullptr;
 }
 
-} // namespace rocksdb
+} // namespace ROCKSDB_NAMESPACE
 
 #else
 
@@ -41,7 +41,7 @@ std::shared_ptr<Cache> NewClockCache(
 #include "util/autovector.h"
 #include "util/mutexlock.h"
 
-namespace rocksdb {
+namespace ROCKSDB_NAMESPACE {
 
 namespace {
 
@@ -756,6 +756,6 @@ std::shared_ptr<Cache> NewClockCache(
 capacity, num_shard_bits, strict_capacity_limit, metadata_charge_policy);
 }
 
-} // namespace rocksdb
+} // namespace ROCKSDB_NAMESPACE
 
 #endif // SUPPORT_CLOCK_CACHE
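
The `namespace rocksdb` to `namespace ROCKSDB_NAMESPACE` substitutions in the cache files rely on the macro falling back to `rocksdb` when the build does not define it, as the HISTORY.md entry for this commit states. Below is a minimal standalone sketch of that pattern. The `#ifndef` guard is an assumed rendering of the fallback, not a copy of `rocksdb_namespace.h`, and `NamespaceName` and the `SKETCH_STR` macros are hypothetical helpers added only to make the effect observable.

```cpp
#include <cassert>
#include <string>

// Assumed fallback: use "rocksdb" unless the build already defines
// ROCKSDB_NAMESPACE (for example with -DROCKSDB_NAMESPACE=my_rocksdb).
#ifndef ROCKSDB_NAMESPACE
#define ROCKSDB_NAMESPACE rocksdb
#endif

// Two-step stringification so the macro is expanded before it is quoted.
#define SKETCH_STR_INNER(x) #x
#define SKETCH_STR(x) SKETCH_STR_INNER(x)

namespace ROCKSDB_NAMESPACE {
// Hypothetical helper, not part of RocksDB: reports the active namespace name.
inline std::string NamespaceName() { return SKETCH_STR(ROCKSDB_NAMESPACE); }
}  // namespace ROCKSDB_NAMESPACE
```

With no override, `ROCKSDB_NAMESPACE::NamespaceName()` and `rocksdb::NamespaceName()` name the same function. Code that spells the namespace through the macro keeps compiling when the build overrides it, whereas a hard-coded `rocksdb::` would not.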
